Abstract

We prove that for a minimal rotation $T$ on a 2-step nilmanifold
and any measure $\mu$, the push-forward $T_*^n\mu$ of $\mu$ under $T^n$ tends toward
Haar measure if and only if $\mu$ projects to Haar measure on the maximal
torus factor. For an arbitrary nilmanifold we get the same result along
a sequence of uniform density 1. These results strengthen Parry's result
[8] that such systems are uniquely ergodic. Extending the work of
Furstenberg [3], we get the same result for a large class of iterated
skew products. Additionally we prove a multiplicative ergodic theorem
for functions taking values in the upper unipotent group. Finally, we
characterize limits of $T_*^n\mu$ for some skew product transformations with
expansive fibers. All results are presented in terms of twisting and 
weak twisting, properties which strengthen unique ergodicity in a way 
analogous to how mixing and weak mixing strengthen ergodicity for 
measure preserving systems. 

1 Introduction 

By a topological dynamical system we shall mean a compact metric space $X$
equipped with a homeomorphism $T : X \to X$ (all systems will be assumed
invertible). We will denote such a system by a pair $(X,T)$. Given two
systems, $(X,T)$ and $(Y,S)$, a factor map $\Phi : (X,T) \to (Y,S)$ is a surjective
continuous map $\Phi : X \to Y$ such that $\Phi(T(x)) = S(\Phi(x))$. Consider the
following easy rephrasing of well known results from ergodic theory. (For
the basic definitions of ergodic theory, see [9].)

Proposition 1.1. Let $(X,T)$ be a topological dynamical system and let $m$
be an invariant probability measure on $X$.

1. $m$ is ergodic if and only if for all $\mu$ absolutely continuous with respect
to $m$, and for all $f \in C(X)$

$$\lim_{N\to\infty} \frac{1}{N}\sum_{n=0}^{N-1} \int_X f\circ T^n\,d\mu - \int_X f\,dm = 0.$$

2. $m$ is weakly mixing if and only if for all $\mu$ absolutely continuous with
respect to $m$, and for all $f \in C(X)$

$$\lim_{N\to\infty} \frac{1}{N}\sum_{n=0}^{N-1} \left| \int_X f\circ T^n\,d\mu - \int_X f\,dm \right| = 0.$$

3. $m$ is mixing if and only if for all $\mu$ absolutely continuous with respect
to $m$, and for all $f \in C(X)$

$$\lim_{n\to\infty} \int_X f\circ T^n\,d\mu - \int_X f\,dm = 0.$$

Proof. Let $d\mu = \psi\,dm$ where $\psi \in L^1(m)$ and let $\psi_M = \min(\psi, M)$. Then
$f, \psi_M \in L^2(m)$, so in statements 1, 2, and 3, the Hilbertian definitions of
ergodicity, weak mixing, and mixing dictate that the appropriate limits hold
when

$$\int f\circ T^n\,d\mu \quad\text{is replaced by}\quad \int f\circ T^n\,\psi_M\,dm = \langle f\circ T^n, \psi_M\rangle.$$

Now we use the fact that $f$ is bounded to pass to a limit in $M$ and derive
the same results for $\int f\circ T^n\,d\mu$.

For the converses, we observe $L^1(m)$ contains $L^2(m)$ and $C(X)$ is dense
in $L^2(m)$. The Hilbertian definitions follow immediately from the limits
above. □

The goal of this paper is to study similar averages and limits where $\mu$
has been replaced by some probability measure that is singular with respect
to $m$. More specifically, if $(X,T)$ is a topological system with unique invariant
measure $m$, we study those $\mu$ for which we could expect the limits in
Proposition 1.1 to hold.

In Proposition 1.1, using test functions $f$ from $C(X)$ seems unnatural
since the topology of $X$ is irrelevant to the usual definitions of ergodicity,
weak mixing, and mixing. One usually makes similar statements in $L^2(m)$
where the integral $\int f\,d\mu$ against an absolutely continuous measure
$d\mu = \varphi\,dm$ is replaced by an inner product: $\int f\,d\mu = \int f\varphi\,dm = \langle f, \varphi\rangle$. However,
when one wishes to study singular $\mu$, there is no obvious analogue of $\varphi$, so
$L^2(m)$ is insufficient.

One can rephrase Proposition 1.1 in terms of the space $P(X)$ of Borel
probability measures on $X$. Considered as a subspace of $C(X)^*$, $P(X)$ may
be equipped with the weak* topology. In particular Proposition 1.1 (3) can
be rewritten: $m$ is mixing if and only if for all $\mu$ absolutely continuous with
respect to $m$, $\lim_{n\to\infty} T_*^n\mu = m$, where $T_*^n\mu$ is the push-forward measure
defined by $\int f\,dT_*^n\mu = \int f\circ T^n\,d\mu$.

Before stating our results regarding singular $\mu$, we recall some preliminaries.
A nilmanifold is a space $X$ of the form $G/\Gamma$ where $G$ is a connected
nilpotent Lie group and $\Gamma$ is a cocompact lattice. That is, $\Gamma$ is a discrete
subgroup of $G$ such that the quotient space $G/\Gamma$ is compact. $G$ acts on $X$
by left multiplication. Such groups $G$ admit a bi-invariant Haar measure
$m$. By identifying $X$ with a fundamental domain for the action of $\Gamma$
on $G$, we can equip $X$ with a finite measure. Since $m$ is unique up to
scaling, we may assume the measure on $X$ is a probability measure. For
simplicity we write $m$ both for the measure on $G$ and for the measure on $X$,
and call both Haar measure.
The left action of $G$ on $X$ preserves $m$. We say a sequence of measures $\mu_n$
on $X$ equidistributes if for all $f \in C(X)$,

$$\lim_{n\to\infty} \int f\,d\mu_n = \int f\,dm.$$

Write $G_0 = G$ and $G_{n+1} = [G, G_n]$. Then $G_{n+1} \triangleleft G_n$, and $G_n/G_{n+1}$ is
abelian and connected. The sequence of subgroups $G_n$ is called the lower
central series of $G$. Since $G$ is nilpotent there exists some $d$ such that
$G_d = \{1\}$. The least such $d$ is called the degree of nilpotency of $G$ (and $X$). The
lower central series of $G$ gives us a sequence of quotients $X_n := G/G_n\Gamma$ of $X$.
The fiber above each point in the factor map $X_{n+1} \to X_n$ is homeomorphic
to

$$G_n/G_{n+1}\Gamma = (G_n/G_{n+1})/((\Gamma\cap G_n)G_{n+1}/G_{n+1}),$$

which is a compact quotient of $G_n/G_{n+1}$ and hence a torus. In fact $X_{n+1}$ is
a torus bundle over $X_n$. Therefore $X$ is derived from the one point space $X_0$
by repeatedly taking circle bundles. We call $X_1$ the maximal torus factor of
$X$.

Let $J$ be a set of integers. When the following limits exist

$$d(J) := \lim_{N\to\infty} \frac{\#(J\cap[1,N])}{N} \quad\text{and}\quad d^*(J) := \lim_{N-M\to\infty} \frac{\#(J\cap[M,N])}{N-M+1}$$

we refer to their values as the density and uniform density of $J$. If $d^*(J)$ is
well defined then so is $d(J)$ and the two coincide.
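
The converse of the last remark fails, and a small computation makes the difference between the two densities concrete. The following Python sketch is only an illustration under stated assumptions: the set `gappy_set` (the positive integers with the blocks $[2^k, 2^k+k)$ removed) and the helper names are hypothetical and do not come from the paper. The removed blocks are sparse, so the ordinary density is 1, but they contain arbitrarily long windows that miss the set entirely, so the uniform density does not exist.

```python
def density_upto(J, N):
    """Fraction of {1, ..., N} that lies in J."""
    return sum(1 for n in range(1, N + 1) if n in J) / N


def window_density(J, M, N):
    """Fraction of the window {M, ..., N} that lies in J."""
    return sum(1 for n in range(M, N + 1) if n in J) / (N - M + 1)


def gappy_set(limit):
    """Integers in [1, limit] outside the blocks [2^k, 2^k + k), k >= 1.

    Hypothetical example set, chosen only to separate the two densities.
    """
    removed = set()
    k = 1
    while 2 ** k <= limit:
        removed.update(range(2 ** k, 2 ** k + k))
        k += 1
    return set(range(1, limit + 1)) - removed


J = gappy_set(1 << 16)
overall = density_upto(J, 1 << 16)                      # close to 1
inside_gap = window_density(J, 2 ** 15, 2 ** 15 + 14)   # window inside a removed block
after_gap = window_density(J, 2 ** 15 + 15, 2 ** 15 + 29)  # window just after it
print(overall, inside_gap, after_gap)
```

Windows of the same length 15 give density 0 inside a removed block and density 1 immediately after it, which is exactly the behavior that the limit defining $d^*$ rules out.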

Theorem 1.2. Let $X = G/\Gamma$ be a nilmanifold. Fix $u \in G$ and let $T(g\Gamma) = ug\Gamma$.
If the system $(X,T)$ is transitive (i.e. has a dense orbit) then for any
probability measure $\mu$ on $X$ that projects to Haar measure on the maximal
torus factor, there exists a subset $J \subset \mathbb{Z}$ of uniform density zero such that
for any $f \in C(X)$

$$\lim_{n\to\infty,\ n\notin J} \int_X f\circ T^n\,d\mu = \int_X f\,dm.$$

In other words, the sequence $\{T_*^n\mu : n \notin J\}$ equidistributes.

Furthermore, if one fixes $f$ and lets $\mu$ range then the following limit
converges uniformly in $\mu$.

$$\lim_{N-M\to\infty} \frac{1}{N-M}\sum_{i=M}^{N-1} \left| \int_X f\circ T^i\,d\mu - \int_X f\,dm \right| = 0.$$

When we say $\mu$ projects to Haar measure on the maximal torus factor
we mean that $\pi_*\mu$ is Haar measure on $X_1$, where $\pi : X \to X_1$ is the obvious
factor map.

Theorem 1.3. Let $(X,T)$ and $\mu$ be as in Theorem 1.2. If we assume that
$G$ is a 2-step nilpotent group, then the limit holds with no exceptional set $J$:

$$\lim_{n\to\infty} \int_X f\circ T^n\,d\mu = \int_X f\,dm.$$

In other words, $T_*^n\mu$ equidistributes.

As will be discussed in the next section, the conclusions of Theorems 1.2
and 1.3 imply unique ergodicity. So, these theorems strengthen the result
of Parry [8] which asserts that such systems are uniquely ergodic.

While nilmanifolds (spaces $G/\Gamma$ as in Theorem 1.2) are not usually tori,
their topologies are locally similar because, as mentioned above, nilmanifolds
can be constructed by repeatedly taking circle bundles. Combining
techniques used in the proof of Theorem 1.2 with a multiplicative ergodic
theorem (Theorem 1.6) for unipotent valued cocycles, we get

Theorem 1.4. Let $X = \mathbb{T}^d$ and define $T : X \to X$ by

$$T(x_1,\dots,x_d) = (x_1+\alpha,\ x_2+f_1(x_1),\ \dots,\ x_d+f_{d-1}(x_1,\dots,x_{d-1})),$$

where $\alpha$ is irrational and each $f_k(x_1,\dots,x_{k-1},\cdot) : \mathbb{T} \to \mathbb{T}$ is Lipschitz and
homotopically non-trivial. For any probability measure $\mu$ on $X$ that projects
to Lebesgue measure on the first coordinate there exists a subset $J \subset \mathbb{Z}$ of
uniform density zero such that for any $f \in C(X)$

$$\lim_{n\to\infty,\ n\notin J} \int_X f\circ T^n\,d\mu = \int_X f\,dm.$$

In other words, the sequence $\{T_*^n\mu : n \notin J\}$ equidistributes.

Furthermore, if one fixes $f$ and lets $\mu$ range then the following limit
converges uniformly in $\mu$.

$$\lim_{N-M\to\infty} \frac{1}{N-M}\sum_{i=M}^{N-1} \left| \int_X f\circ T^i\,d\mu - \int_X f\,dm \right| = 0.$$

Theorem 1.5. Let $(X,T)$ be as in Theorem 1.4 with $d = 2$ (i.e. $T(x,y) =
(x+\alpha, y+f(x))$). Then the limit holds with no exceptional set:

$$\lim_{n\to\infty} \int_X f\circ T^n\,d\mu = \int_X f\,dm.$$

Furstenberg proves in [3] (Theorem 2.1) that such systems are uniquely 
ergodic. Our theorem is a direct extension of his. 
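
The simplest instance of Theorem 1.5 is $f(x) = x$, that is $T(x,y) = (x+\alpha, y+x)$, with $\mu$ equal to Lebesgue measure on a horizontal circle $\{(x,y_0)\}$. By Weyl's criterion, equidistribution on the torus is equivalent to the decay of every nontrivial Fourier coefficient. In this case the coefficient at frequency $(p,q)$ of $T_*^n\mu$ is a unimodular constant times $\int_0^1 e^{2\pi i (p+qn)x}\,dx$, which vanishes unless $p + qn = 0$. The Python sketch below checks this numerically. It is an illustration only: the value of `ALPHA`, the height `y0`, the sample count, and the function names are assumptions made for the example, not part of the paper.

```python
import cmath
import math

ALPHA = math.sqrt(2) - 1  # illustrative irrational rotation number (assumption)


def iterate(x, y, n, alpha=ALPHA):
    """Closed form for T^n(x, y), where T(x, y) = (x + alpha, y + x) mod 1."""
    xn = (x + n * alpha) % 1.0
    yn = (y + n * x + n * (n - 1) / 2 * alpha) % 1.0
    return xn, yn


def fourier_coefficient(p, q, n, y0=0.25, samples=4096):
    """Riemann-sum approximation of the (p, q) Fourier coefficient of
    T^n_* mu, with mu = Lebesgue measure on the circle {(x, y0)}."""
    total = 0j
    for j in range(samples):
        xn, yn = iterate(j / samples, y0, n)
        total += cmath.exp(2j * math.pi * (p * xn + q * yn))
    return total / samples


# Frequency (0, 1): modulus 1 at n = 0 (mu is singular), negligible afterwards.
for n in (0, 1, 5, 20):
    print(n, abs(fourier_coefficient(0, 1, n)))
```

For each fixed nonzero frequency the coefficient is nonzero for at most one value of $n$, which is much stronger than the decay Weyl's criterion requires. This exactness is special to the linear choice $f(x) = x$ and should not be expected for a general Lipschitz $f$.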

The multiplicative ergodic theorem alluded to above is one of the most
significant results in this paper, so we include it here. Let $U$ be the group
of upper triangular $d \times d$ matrices with entries in $\mathbb{R}$ and ones on the
diagonal. Then $U$ admits a one parameter family $\theta_t$ of dilations given by

$$\theta_t\begin{pmatrix} 1 & u_{1,2} & u_{1,3} & \cdots & u_{1,d} \\ & 1 & u_{2,3} & \cdots & u_{2,d} \\ & & \ddots & \ddots & \vdots \\ & & & 1 & u_{d-1,d} \\ & & & & 1 \end{pmatrix} = \begin{pmatrix} 1 & t\,u_{1,2} & t^2u_{1,3} & \cdots & t^{d-1}u_{1,d} \\ & 1 & t\,u_{2,3} & \cdots & t^{d-2}u_{2,d} \\ & & \ddots & \ddots & \vdots \\ & & & 1 & t\,u_{d-1,d} \\ & & & & 1 \end{pmatrix}.$$

More formally $(\theta_t(u))_{i,j} = t^{j-i}u_{i,j}$ for $j > i$. It is not hard to check that
each $\theta_t$ is an automorphism of $U$. In fact $\theta : t \mapsto \theta_t$ is a homomorphism from
the semigroup $((0,\infty),\times)$ into the automorphism group of $U$. If we equip
the latter group with the topology of uniform convergence on compact sets,
then $\theta$ is continuous.

Theorem 1.6. Let $(X,\mathcal{B},m,T)$ be a probability measure preserving system
and suppose $f : X \to U$ is bounded and measurable. Then there exists
$f^* : X \to U$ (also bounded, measurable) such that for almost every $x \in X$

$$\lim_{n\to\infty} \theta_{1/n}\bigl(f(T^{n-1}x)\cdots f(T^2x)f(Tx)f(x)\bigr) = f^*(x).$$


k=i UA 

where $\lambda = \lambda(j-i)$ is some positive constant depending only on $j-i$.
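
The dilations and the normalization in Theorem 1.6 can be checked by hand in the simplest case. The Python sketch below is an illustration only and not part of the paper's argument. It assumes $d = 3$ and a constant cocycle $f \equiv u$, so that the ordered product is just $u^n$, and the helper names `theta`, `matmul`, `matpow` are hypothetical. It verifies on sample matrices that the dilation multiplying the $(i,j)$ entry by $t^{j-i}$ respects matrix multiplication, and it shows that the $(1,3)$ entry of the normalized product approaches $u_{1,2}u_{2,3}/2$ while the superdiagonal entries are unchanged.

```python
from fractions import Fraction


def theta(t, u):
    """Dilation theta_t: multiply the (i, j) entry of u by t**(j - i)."""
    d = len(u)
    return [[u[i][j] * t ** (j - i) for j in range(d)] for i in range(d)]


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    d = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]


def matpow(u, n):
    """n-th power of u by repeated multiplication (n >= 1)."""
    result = u
    for _ in range(n - 1):
        result = matmul(result, u)
    return result


# Sample unipotent matrices (illustrative values, not from the paper).
u = [[1, 2, 5], [0, 1, 3], [0, 0, 1]]
v = [[1, -1, 4], [0, 1, 7], [0, 0, 1]]
s, t = Fraction(3, 2), Fraction(2, 5)

# theta_t is multiplicative, and t -> theta_t is a homomorphism.
assert theta(t, matmul(u, v)) == matmul(theta(t, u), theta(t, v))
assert theta(s, theta(t, u)) == theta(s * t, u)

# Constant cocycle f = u: the ordered product is u**n.
n = 1000
normalized = theta(Fraction(1, n), matpow(u, n))
print(float(normalized[0][2]))  # close to u[0][1] * u[1][2] / 2 = 3
```

Exact rational arithmetic is used so that the two identities are tested without rounding error. The factor $1/2$ seen here belongs to this constant-cocycle computation only and is not claimed to be the constant appearing in the theorem.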

Just as Birkhoff's pointwise theorem becomes stronger in the uniquely
ergodic case, so does Theorem 1.6.

Theorem 1.7. Suppose $(X,T)$ is uniquely ergodic and $f : X \to U$ is continuous.
Then

$$\lim_{n\to\infty} \theta_{1/n}\bigl(f(T^{n-1}x)\cdots f(T^2x)f(Tx)f(x)\bigr)$$

converges uniformly to the constant given in Theorem 1.6.

Finally we prove some partial results on the behavior of measures under
pushforward in systems like $(x+\alpha, 2y+f(x))$ which are neither isometric
extensions, nor iterated isometric extensions over their maximal equicontinuous
factor, and hence are of a fundamentally different character from
the systems discussed above. These theorems prove the existence of some
interesting $T_*$-invariant probability measures on $P(X)$, which, as we will see
in the next section, are pertinent to the asymptotic behavior of $T_*^n\mu$.

2 Twisting and weak twisting

Before proceeding to the proofs, we introduce some new definitions (twisting
and weak twisting) which provide a nice abstract perspective on the results.
We wish to build an analogy between the triples ergodicity, weak mixing,
strong mixing, and unique ergodicity, weak twisting, twisting.

Suppose $(X,T)$ is a transitive topological system (i.e. has a point with
dense orbit). We can define $T_* : P(X) \to P(X)$ by the pushforward:

$$T_*\mu(A) := \mu(T^{-1}A), \quad\text{or equivalently}\quad \int f\,dT_*\mu = \int f\circ T\,d\mu.$$

This makes $(P(X),T_*)$ into a compact topological system. Portions of this
system have been studied by Glasner in [4], who introduced measure theoretic
quasi-factors (certain invariant probability measures on $P(X)$). This
avenue of research was furthered by Glasner and Weiss in [6].

An invariant measure $m$ on $X$ appears in this system as a fixed point.
We will study the basin of attraction of $m$ (that is, the set of $\mu$ for which
$T_*^n\mu \to m$). As explained in Proposition 1.1, $m$ is mixing if and only if every
probability measure $\mu$ absolutely continuous with respect to $m$ attracts to
$m$. So, if $m$ has full support and is mixing then its basin of attraction is
dense. If all of $P(X)$ attracts to $m$ then, in particular, $\delta_x$ attracts to $m$ for
each $x \in X$. But this implies that $m$ is itself a point mass $\delta_{x_0}$. It follows
that $T^n(x)$ must tend to $x_0$ for every $x$. From most perspectives, this is
not very interesting. Indeed, from a measure theoretic perspective, $(X,T)$
is equivalent to the one point system.

What, then, is the largest closed invariant subset $P'$ of $P(X)$ for which
it is reasonable to ask if all $\mu$ in $P'$ attract to $m$? Notice that a convex
combination of measures attracting to $m$ also attracts to $m$. So we may as
well consider only convex $P'$.

Let $K = K(X,T)$ be the maximal equicontinuous factor of $(X,T)$.
Specifically, $K$ is the maximal ideal space of the algebra

$$\{f \in C(X) : \{f\circ T^n : n \in \mathbb{Z}\} \text{ is compact}\}.$$

We will also denote the induced transformation on $K$ by $T$. The above
definition is convenient for our purposes (for instance it makes it clear that
$K$ is a functor). For other definitions and further discussion, see [5]. One
important fact we will use is that a metric may be chosen for $K(X,T)$
such that $T$ acts by isometries. Transitivity, equicontinuity, and invertibility
together imply minimality. So $K$ is a compact abelian group and $T(x) = ax$
for some fixed $a \in K$ (see for instance [9]).

Write $m_K$ for normalized Haar measure on $K$. Since $m$ was assumed
to be invariant on $X$, it projects to an invariant measure on $K$. Unique
ergodicity of $(K,T)$ tells us $m$ must project to $m_K$. Suppose some $\mu \in P'$
projects to a measure $\nu$ on $K$ different from $m_K$. It is easy to see $\nu$ does not
attract to $m_K$ (later we will put a metric on $P(K)$ with the property that
$T_*$ is an isometry, so the distance from $T_*^n\nu$ to $m_K$ is independent of $n$).
Since $\nu$ does not attract to $m_K$, it follows that $\mu$ does not attract to $m$. So,
for our purposes, we need only consider $P'$ contained in the following set.

Definition 2.1. We write $P_1 = P_1(X,T)$ for the set of all probability measures
on $X$ which project to $m_K$. This is a convex closed nonempty invariant
subset of $P(X)$. So, $(P_1, T_*|_{P_1})$ is a topological dynamical system.

Definition 2.2. We will call $(X,T)$ twisting if

$$\text{for all } \mu \in P_1, \quad \lim_{n\to\infty} T_*^n(\mu) = m.$$

Loosely speaking, a system is twisting if every probability measure which
conceivably could, equidistributes under repeated application of $T_*$ (i.e. unless
it is prohibited from doing so by the maximal equicontinuous factor).

In the spirit of treating $(P(X),T_*)$ as a topological system, it is interesting
to study its invariant measures. We know of one invariant measure:
$\delta_m$. The argument we will use to prove Proposition 2.6 part 2 shows that $m$
is weakly mixing if and only if every measure $\mu$ absolutely continuous with
respect to $m$ is a typical point for $\delta_m$. That is, for any $F \in C(P(X))$

$$\lim_{N\to\infty} \frac{1}{N}\sum_{n=0}^{N-1} F(T_*^n\mu) = \int F\,d\delta_m = F(m).$$

So, if $m$ has full support and is weakly mixing, then there is a dense set of
measures $\mu \in P(X)$ which are typical points for the invariant measure $\delta_m$
on the system $(P(X),T_*)$. Is it possible that every $\mu$ is a typical point for
$\delta_m$? Equivalently, is $(P(X),T_*)$ ever uniquely ergodic?

The answer is obviously no. Notice that $(X,T)$ is a subsystem of $(P(X),T_*)$
where the inclusion is given by $\iota(x) = \delta_x$. Our invariant measure $m$ on $X$
gives us an invariant measure on $P(X)$ in two ways. Certainly we have $\delta_m$.
But we also have $\iota_*(m) = \iota_*(\int_X \delta_x\,dm) = \int_X \delta_{\delta_x}\,dm$. Unless $X$ is a single
point, these are distinct. In other words, the only way $(P(X),T_*)$ can be
uniquely ergodic is if $(X,T)$ is already the trivial system.

By analogy to the discussion of attracting fixed points, one might wonder
what is the largest subsystem $P' \subset P(X)$ for which it is reasonable to ask
if every point is typical for $\delta_m$? Such measures don't necessarily tend to $m$,
but they do spend most of their time near $m$ in the sense of uniform density
(see Corollary 2.8). Equivalently, what is the largest subsystem $P' \subset P(X)$
for which it is reasonable to ask if $(P',T_*)$ is uniquely ergodic?

Suppose some $\mu \in P'$ projects to $\nu \ne m_K$. Then $T_*^n\nu$ avoids some weak*
neighborhood of $m_K$. If we average along the orbit of $\nu$ as in the usual
proof of the existence of invariant measures (that is, we take a weak* limit
of $N^{-1}\sum_{n=0}^{N-1}\delta_{T_*^n\nu}$) we get a measure different from $\delta_{m_K}$. Indeed, the two
measures cannot be the same because they have disjoint support. It follows
that any weak* limit of $N^{-1}\sum_{n=0}^{N-1}\delta_{T_*^n\mu}$ is different from $\delta_m$. In particular
$(P',T_*)$ is not uniquely ergodic. As before, we conclude that, at the very
least, we must require $P' \subset P_1$.

Definition 2.3. We call $(X,T)$ weakly twisting if $(P_1,T_*)$ is uniquely ergodic.
This is equivalent to requiring that each $\mu \in P_1$ is a typical point for
$\delta_m$.
The next proposition follows immediately from the definitions given 
above. 

Proposition 2.4. 1. Unique ergodicity of $(X,T)$ is equivalent to the existence
of a unique fixed point in $P(X)$ (or in $P_1(X)$).

2. Weak twisting is equivalent to the existence of a unique fixed point in
$P(P_1(X))$ (or in $P_1(P_1(X))$).

3. Twisting is equivalent to the existence of a unique universal attracting
fixed point in $P_1(X)$.

Proposition 2.5. Twisting implies weak twisting. Weak twisting implies
unique ergodicity.

Proof. If $T_*^n\mu$ tends to $m$ then $\mu$ is a typical point for $\delta_m$. This gives the first
implication. Since invariant measures $m_1, m_2$ on $X$ give rise to invariant
measures $\delta_{m_1}, \delta_{m_2}$ on $P_1(X)$, we see that unique ergodicity of $P_1$ implies
unique ergodicity of $X$. This proves the second implication. □

As we said at the beginning of this section, we wish to build an analogy
between the triples ergodicity, weak mixing, mixing, and unique ergodicity,
weak twisting, twisting. The following characterizations should make that
analogy clear (compare to Proposition 1.1).

Proposition 2.6. Let $(X,T)$ be a topological dynamical system and let
$m \in P(X)$ be an invariant measure.

1. $(X,T)$ is uniquely ergodic if and only if for all $\mu \in P_1$ and for all
$f \in C(X)$

$$\lim_{N\to\infty} \frac{1}{N}\sum_{n=0}^{N-1} \int_X f\circ T^n\,d\mu - \int_X f\,dm = 0.$$

2. $(X,T)$ is weakly twisting if and only if for all $\mu \in P_1$ and for all
$f \in C(X)$

$$\lim_{N\to\infty} \frac{1}{N}\sum_{n=0}^{N-1} \left| \int_X f\circ T^n\,d\mu - \int_X f\,dm \right| = 0.$$

3. $(X,T)$ is twisting if and only if for all $\mu \in P_1$ and for all $f \in C(X)$

$$\lim_{n\to\infty} \int_X f\circ T^n\,d\mu - \int_X f\,dm = 0.$$

That is, $\lim_{n\to\infty} T_*^n\mu = m$.

The only difference between Propositions 1.1 and 2.6 is that, in each
case, the assumption that $\mu$ is absolutely continuous with respect to $m$ has
been replaced by the assumption that $\mu$ lie in $P_1$. We should expect that
usually $P_1$ contains many measures which are not absolutely continuous with
respect to $m$.

It would probably be beneficial for the reader to keep in mind the simple
motivating example $T(x,y) = (x+\alpha, y+x)$ on $\mathbb{T}^2$. Here, $\alpha$ is some irrational
number. Two dimensional Lebesgue measure is the unique invariant
measure for this minimal system (see [3]). The class $P_1$ contains many singular
measures. It includes, for instance, one dimensional Lebesgue measure
supported on a horizontal line $\{(x,y_0) : x \in \mathbb{T}\}$. Using harmonic analysis,
it is not difficult to prove that this measure equidistributes under repeated 
application of T. We leave this proof to the reader, and derive the result, 
instead, from the more complicated, but significantly more general Theorem 



One may object that in Proposition 2.6 (1) the assumption that $\mu$ lie
in $P_1$ is not necessary. However, for parts (2) and (3) it is obvious that
this assumption is unavoidable (this is the content of the discussion above
involving reasonable choices of $P'$). In part (1) we assume $\mu \in P_1$ to reinforce
the similarity between Propositions 2.6 and 1.1.

In the proof of Proposition 2.6 and in future propositions we call on the
following well known fact. The proof is easy and is left to the reader.

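Lemma 2.7. A sequence $x_n$ of non-negative real numbers satisfies

$$\lim_{N\to\infty} \frac{1}{N}\sum_{n=1}^{N} x_n = 0$$

if and only if

$$\lim_{n\to\infty,\ n\in J} x_n = 0$$

for some $J \subset \mathbb{N}$ of density 1. Similarly,

$$\lim_{N-M\to\infty} \frac{1}{N-M}\sum_{n=M}^{N-1} x_n = 0$$

if and only if the limit of $x_n$ along $J$ is zero for some $J \subset \mathbb{Z}$ of uniform density 1.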
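
Lemma 2.7 can be illustrated numerically. The following Python sketch assumes a hypothetical bounded sequence, equal to 1 on perfect squares and to $1/n$ elsewhere. The sequence and the function names are chosen for the example and do not appear in the paper. Its Cesàro averages tend to 0, and it tends to 0 along the density 1 set of non-squares, even though the full sequence does not converge.

```python
import math


def x(n):
    """Bounded non-negative sequence: 1 on perfect squares, 1/n elsewhere (n >= 1)."""
    r = math.isqrt(n)
    return 1.0 if r * r == n else 1.0 / n


def cesaro(N):
    """Average of x(1), ..., x(N)."""
    return sum(x(n) for n in range(1, N + 1)) / N


def density_of_non_squares(N):
    """Fraction of {1, ..., N} that is not a perfect square."""
    return (N - math.isqrt(N)) / N


for N in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5):
    print(N, cesaro(N), density_of_non_squares(N))
```

The exceptional set (the squares) has density 0, so discarding it changes neither the averages in the limit nor the density of the index set, which is the mechanism behind the lemma.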
Proof of Proposition 2.6. Assume $m$ is the unique invariant measure on $(X,T)$.
It is well known that this is equivalent to assuming that

$$\lim_{N\to\infty} \frac{1}{N}\sum_{n=0}^{N-1} f(T^nx) - \int_X f\,dm = 0$$

for all $x \in X$ and all $f \in C(X)$. Integrating with respect to $\mu$ and applying
the dominated convergence theorem yields the convergence of the average in
(1). Conversely, suppose this average converges for all $\mu \in P_1$. All invariant
measures lie in $P_1$. So, in particular, this average converges if $\mu$ is invariant.
In this case the expression immediately degenerates into $\int f\,d\mu = \int f\,dm$
which implies $\mu = m$.

Now we prove (2). Assume $(X,T)$ is weakly twisting. Fix $f \in C(X)$
and define $F \in C(P_1)$ by $F(\mu) = |\int f\,d\mu - \int f\,dm|$. Since $\delta_m$ is the unique
invariant measure on $(P_1,T_*)$ we see that for any $\mu \in P_1$, the limit in (2) is
equal to

$$\lim_{N\to\infty} \frac{1}{N}\sum_{n=0}^{N-1} F(T_*^n\mu) = \int F\,d\delta_m = F(m) = 0.$$

To see the converse, notice that if the limit in (2) holds then for any $f$ and
any $\varepsilon > 0$, by Lemma 2.7 we can find a sequence of density 1 along which
$|\int f\,dT_*^n\mu - \int f\,dm| < \varepsilon$. Intersecting finitely many such sequences proves
that the set of $n \in \mathbb{N}$ for which $T_*^n\mu$ lies in a given weak* neighborhood of $m$
has density 1. Obviously, then $\mu$ is a typical point for the invariant measure
$\delta_m$ on $P_1$.

For (3) there is nothing to prove. This is the definition. □

The numerous characterizations of unique ergodicity give rise to characterizations
of weak twisting.

Corollary 2.8. With $(X,T)$ and $m$ as in Proposition 2.6, weak twisting
is equivalent to the assertion that for each $\mu \in P_1$ there exists a sequence
$J \subset \mathbb{N}$ of uniform density 1 such that $\lim_{n\in J} T_*^n\mu = m$.
(Moreover, this is never true when $\mu \notin P_1$.)

Weak twisting is also equivalent to the convergence of the uniform averages

$$\lim_{N-M\to\infty} \frac{1}{N-M}\sum_{i=M}^{N-1} \left| \int f\circ T^i\,d\mu - \int f\,dm \right| = 0.$$

Furthermore, if one fixes $f$, this convergence is uniform in $\mu$. (Again, this
is never true when $\mu \notin P_1$.)

Proof. It is a standard fact that for a uniquely ergodic system, the convergence
of $N^{-1}\sum_{n=0}^{N-1} f(T^nx)$ is uniform in $x$ (see for instance [9]). In particular
replacing $x$ by $T^{-M}x$ does not change the rate of convergence. Applying
this to the function $F$ defined in the proof of Proposition 2.6 gives uniform
convergence of the uniform averages. The second equivalence follows from
Lemma 2.7.

When $\mu \notin P_1$ we follow the argument given at the beginning of this
section regarding choices for $P'$. Specifically let $\nu$ be the projection of $\mu$
onto $K = K(X,T)$. The norm $\|\cdot\|_*$ (which will be defined at the beginning
of the next section) induces the weak* topology on $P(K)$ and, under this
norm, $T_*$ acts by isometries. Therefore, if $\nu \ne m_K$, $T_*^n\nu$ is bounded away
from $m_K$ (Haar measure on $K$) independently of $n$. It follows that $T_*^n\mu$ is
bounded away from $m$. □

Example 2.9. Minimal equicontinuous systems are twisting because $P_1 = \{m\}$.

Example 2.10. There are systems which are uniquely ergodic but not
weakly twisting. For instance, choose a (non-trivial) weakly mixing measure
preserving system and use the Jewett-Krieger theorem (see, for instance,
[2]) to construct a uniquely ergodic topological realization $(X,T)$.
Since $(X,\mathcal{B},\mu,T)$ has no measure theoretic Kronecker factor, its maximal
equicontinuous factor must be the one point system. This tells us that
$P_1(X) = P(X)$, which cannot be uniquely ergodic unless $X$ is one point (as
discussed above).

This example is not terribly satisfying. Its construction relies on the
(opaque) Jewett-Krieger theorem which produces systems on totally disconnected
spaces. See Question 4.6.

Example 2.11. So far, the explicit examples we have seen of invariant
measures on $P_1(X)$ have all been of the form $\delta_\mu$ where $\mu$ is some invariant
measure on $X$. Given some collection $M$ of invariant measures and some
measure $\theta$ on $M$ we can always define $\eta = \int_M \delta_\mu\,d\theta(\mu)$ and obtain another
invariant measure on $P_1$. These are the trivial examples. For an invariant
measure on $P_1$ which does not arise as a convex combination of point-masses
at fixed points see Example 5.5.

Example 2.12. Now we give an example of a system which is weakly twisting
but not twisting. Choose a subset $A \subset \mathbb{Z}$ having uniform density 1 and
an infinite complement. Identify $A$ with a point $a \in \{0,1\}^{\mathbb{Z}}$. Write $\sigma$ for the
shift ($\sigma x(n) = x(n+1)$) and let $X$ be the closure of $\{\sigma^n(a) : n \in \mathbb{Z}\}$. Notice
$X$ contains the shift invariant point $(\dots,1,1,1,\dots) =: \mathbf{1}$, and therefore admits
the invariant measure $\delta_{\mathbf{1}}$. It follows that the maximal equicontinuous
factor of $(X,\sigma)$ is the one point system. So, $P_1(X,\sigma) = P(X)$.

The sets corresponding to the points of $X$ all have uniform density 1 and
that density is 'achieved uniformly'. That is, for any $\varepsilon > 0$ there exists an $N$
such that for all $x \in X$, if $b - a > N$ then



$$\frac{1}{b-a+1}\sum_{i=a}^{b} x(i) > 1 - \varepsilon.$$



Fix $M > 0$ and let $U = \{x \in X : x(i) = 1,\ -M < i < M\}$ be a small
neighborhood of $\mathbf{1}$. It follows from the work above that for all $\varepsilon > 0$ there
exists $N'$ such that for any $x \in X$ and any $b - a > N'$,



i=b 

Let $\mu \in P_1 = P(X)$. Then



i=6 i=6 

Since $\int_X \chi_U(\sigma^i x)\,d\mu(x) = \sigma^i_*\mu(U) \le 1$, we get



6-o+_ , 

2=0 



But $\varepsilon$ was arbitrary and $U$ was an arbitrary cylindrical neighborhood of $\mathbf{1}$.
So, if $f \in C(X)$ then



1 - /" 

i=l ^ 



It follows from Proposition 2.6 that $(X,\sigma)$ is weakly twisting and $\delta_{\delta_{\mathbf{1}}}$ is the
unique invariant measure on $P_1$. However, $(X,\sigma)$ is not twisting. Indeed,
choose $n_i \in \mathbb{Z}\setminus A$ tending to infinity, let $f$ be the (continuous) characteristic
function of $\{x \in X : x(0) = 1\}$, and let $\mu = \delta_a$. Then

$$\int f\,d\sigma^{n_i}_*\mu = f(\sigma^{n_i}a) = a(n_i) = 0 \ne 1 = f(\mathbf{1}).$$
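
Example 2.12 can be made concrete with a specific set. The Python sketch below assumes the hypothetical choice $A = \mathbb{Z} \setminus \{2^k : k \ge 0\}$, which has uniform density 1 and infinite complement. The function names are also assumptions made for the illustration. For $\mu = \delta_a$ and $f$ the indicator of $\{x : x(0) = 1\}$, the integral of $f$ against the $n$-th pushforward of $\mu$ is simply $a(n)$. Window averages of $a$ are therefore uniformly close to 1, yet $a(n) = 0$ along $n = 2^k$, so there is no convergence along the full sequence.

```python
def a(n):
    """Indicator of A = Z minus {2^k : k >= 0} (an illustrative choice of A)."""
    return 0 if n > 0 and (n & (n - 1)) == 0 else 1


def worst_window_average(length, search_upto):
    """Smallest average of a over the windows [M, M + length), 0 <= M < search_upto."""
    vals = [a(n) for n in range(search_upto + length)]
    window = sum(vals[:length])
    worst = window
    for M in range(1, search_upto):
        # Slide the window one step to the right.
        window += vals[M + length - 1] - vals[M - 1]
        worst = min(worst, window)
    return worst / length


# Window averages are uniformly close to 1 ...
print(worst_window_average(1024, 5000))
# ... yet a(n) = 0 along n = 2^k, so the integrals of f against the
# pushforwards of delta_a do not converge to f(1) = 1.
print([a(2 ** k) for k in range(10, 15)])
```

Only windows with non-negative left endpoint are searched, which suffices here because $a(n) = 1$ for every $n \le 0$.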
Question 2.13. Do there exist topological systems admitting an invariant
measure of full support, which are weakly twisting but not twisting?

This question, the related Question 3.5 and Question 4.7 are, in the
view of the author, the most interesting unresolved problems in this paper.

It is trivial (but pleasing) that $P_1$ is a functor. To see this, suppose
$\Phi : (X,T) \to (Y,S)$ is a factor map of topological systems and write $\pi_X, \pi_Y$
for the projections $X \to K(X,T)$, $Y \to K(Y,S)$ respectively. Then $\pi_Y \circ \Phi$
is a factor map. By maximality of $K(X,T)$ we see this map must factor as
$\pi_Y \circ \Phi = \Phi_K \circ \pi_X$ for some unique map $\Phi_K : K(X,T) \to K(Y,S)$. In other
words $K$ is a functor. So

$$(\pi_Y)_*(\Phi_*P_1(X,T)) = (\Phi_K)_*((\pi_X)_*P_1(X,T)),$$

which proves $\Phi_*(P_1(X,T)) \subset P_1(Y,S)$. Write $\Phi_1 = \Phi_*|_{P_1(X,T)}$. That $\Phi \mapsto \Phi_1$
respects composition follows from the same statement about $\Phi \mapsto \Phi_*$.

One subtle thing that should be verified is that $\Phi_1$ is surjective (since
factor maps of topological systems are required to be surjective). Fix
$\nu \in P_1(Y)$. Think of $C(Y)$ and $C(K(X,T))$ as subspaces of $C(X)$ and let $\nu'$
be the functional on $C(Y) + C(K(X,T))$ which agrees with $\nu$ on $C(Y)$ and
Haar measure on $C(K(X,T))$. Apply the Hahn-Banach Theorem to extend
$\nu'$ to a linear functional $\mu$ on all of $C(X)$ satisfying $\mu(f) \le \|f\|$. Since
$\mu(1) = 1$, $\|\mu\| = 1$. Write $\mu_\pm$ for the positive and negative parts of $\mu$.

Then

$$\|\mu_+\| - \|\mu_-\| = \mu(1) = 1 = \|\mu\| = \|\mu_+\| + \|\mu_-\|.$$

So $\mu_- = 0$, and $\mu$ is a probability measure. Since $\mu$ agrees with Haar measure
on $C(K(X,T))$, it lies in $P_1(X,T)$.

Proposition 2.14. Factors of twisting (weakly twisting) systems are also 
twisting (weakly twisting respectively.) 

Proof. Notice that if a system has a fixed point which is a universal attractor, 
then the same can be said of every factor of that system. Also notice that 
a factor of a uniquely ergodic system is uniquely ergodic. The result now 
follows from the functoriality of $P_1$ and Proposition 2.4. □

The next proposition is probably useless but the proof is too perversely 
entertaining to omit. 

Proposition 2.15. If $(X,T)$ is weakly twisting then the maximal equicontinuous
factor of $(P_1,T_*)$ is trivial. If $(X,T)$ is twisting then $(P_1,T_*)$ is
twisting. If $(X,T)$ is weakly twisting but not twisting then the same holds
for $(P_1,T_*)$.

Proof. By Corollary 2.8, for any $\mu \in P_1(X)$, $T_*^n\mu \to m$ along some sequence.
The same is true for $T_*^n(\pi\mu)$. Without loss of generality, $K(P_1)$ is an isometric
system. It follows that $\pi(\mu) = \pi(m)$. Therefore $K(P_1)$ is a single
point.

By the preceding remark $K(P_1,T_*)$ is minimal, which is necessary to
make sense of $P_1(P_1,T_*)$. We want to show that $P_1(P_1,T_*)$ is uniquely ergodic.
Let $\theta$ be an invariant measure on $P_1(P_1,T_*)$. This is an element of
$P_1(P_1(P_1,T_*),T_{**})$. The barycenter $\theta'$ of $\theta$ is an element of $P(P_1)$ given
by $\theta' := \int_{P_1(P_1,T_*)} \eta\,d\theta(\eta)$. Invariance of $\theta$ gives us invariance of $\theta'$. Therefore
$\theta' = \delta_m$. But the only way to take a convex combination of measures and
get a $\delta$-measure is if the combination is degenerate. In other words $\theta = \delta_{\delta_m}$.
This proves $(P_1(P_1,T_*),T_{**})$ is uniquely ergodic. Equivalently, $(P_1,T_*)$ is
weakly twisting.

If we additionally assume that $(X,T)$ is not twisting, then there is some
measure $\mu \in P_1$ which does not attract to $m$. It follows that $\delta_\mu$ does not
attract to $\delta_m$. So $P_1$ is not twisting.

Now assume $(X,T)$ is twisting. Then it is also weakly twisting and
once again, we can define $P_1(P_1,T_*)$. Fix $\eta \in P_1(P_1,T_*)$. For any $\varepsilon > 0$
and for any neighborhood $U$ of $m$ in $P_1$ there exists $N$ such that for all
$n > N$ we have $T_{**}^n\eta(U) > 1 - \varepsilon$. It follows that any weak* limit $\theta$ of $T_{**}^n\eta$
satisfies $\theta(\{m\}) = 1$. In other words, $\theta = \delta_m$. This proves that $\delta_m$ is the
unique attracting fixed point in $(P_1(P_1,T_*),T_{**})$. In other words, $(P_1,T_*)$ is
twisting.

□

3 Minimal rotations on nilmanifolds 

In this section we derive Theorems 1.2 and 1.3 as easy corollaries of Theorem

Let $G$ be a connected, simply connected, nilpotent Lie group with Lie
algebra $\mathfrak{g}$. Let $\Gamma$ be a lattice in $G$. It is well known that the exponential map
provides a homeomorphism between $G$ and $\mathfrak{g}$ (see [7]). Write $G_i$ for the lower
central series of $G$ and let $l$ be minimum with $G_l = \{1\}$. Write $X_i = G/G_i\Gamma$.
Fix $u \in G$ such that $T(g\Gamma) = ug\Gamma$ is a minimal rotation on $X$.

Theorem 3.1. Let $\mu$ be a probability measure on $X$ which projects to Haar
measure on $X_{l-1}$. Then $T_*^n\mu$ converges in the weak* topology to Haar measure.

We will give two proofs of this theorem. The second proof is shorter and
relies on the fact that rotations on nilmanifolds have countable Lebesgue
spectrum on the orthocomplement of $L^2(X_1)$ (see Green's article: chapter
5 in [1]). The first proof is geometric in nature and shows how shearing
causes invariance. The first proof is longer, but it has two advantages: it
stands alone and, more importantly, the method can be adapted to other
situations. In particular, the geometric method of the first proof applies to
systems with far less algebraic structure, like the skew products discussed
in section 4.

Before proving the theorem, we define a metric inducing the weak* topology
which makes some observations easier. Strictly speaking, this is unnecessary.
However, it allows us to avoid explicitly mentioning test functions.
The author finds this notation convenient and hopes the reader will as well.

Given $\mu \in C(X)^*$ define the norm

$$\|\mu\|_* := \sup\left\{ \int f\,d\mu : f \in C(X),\ |f| \le 1,\ f \text{ 1-Lipschitz} \right\}.$$

The reader should be able to easily verify the triangle inequality. This turns
$C(X)^*$ into a (usually) not complete normed linear space with the following
nice property: let $B \subset C(X)^*$ be bounded in operator norm. Then the
topology induced on $B$ by $\|\cdot\|_*$ is the weak* topology. As we will see,
simple geometric properties of maps on $X$ often translate into equally simple
properties of the induced maps on $C(X)^*$.

To prove that $\|\cdot\|_*$ induces the correct topology, recall that $B$ is metrizable
in the weak* topology (really all the work is hidden in this fact). So it
suffices to prove that a sequence converges under one topology if and only if
it converges under the other. Suppose $\mu_n$ is a sequence in $B$. If $\mu_n$ converges
to $\mu$ in the weak* topology, then $\int f\,d\mu_n$ converges uniformly to $\int f\,d\mu$ for all
$f$ in a compact subset of $C(X)$. Here we are using the fact that $f \mapsto \int f\,d\mu$
is itself 1-Lipschitz. But the set of all 1-Lipschitz $f \in C(X)$, $|f| \le 1$ is compact.
So this tells us that $\|\mu_n - \mu\|_* \to 0$. Conversely, suppose $\|\mu_n - \mu\|_* \to 0$.
Then for any Lipschitz function $f$ we have $|\int f\,d\mu_n - \int f\,d\mu| \to 0$. But the
Lipschitz functions are uniformly dense in the continuous functions. So the
same holds for any $f \in C(X)$.

One obvious fact about $\|\cdot\|_*$ is that isometries of $X$ induce isometries on
$(C(X)^*, \|\cdot\|_*)$. Another observation we will need is that if $S : X \to X$ moves
each point by at most $\varepsilon$, then $\|S_*\mu - \mu\|_* \le \varepsilon\|\mu\|$ (where $\|\mu\|$ denotes the
usual operator norm on $C(X)^*$). This metric also has nice properties with
respect to convex combinations: if we write $\mu = \int \mu_x\,d\nu(x)$ then



Suppose $X = \bigcup_i C_i$ is a partition of $X$ into sets of positive measure. Then
we can define the conditional measures $\mu_x = \mu(C_i)^{-1}\mu|_{C_i}$, when $x \in C_i$.
Letting $\nu = \mu$ and applying the principle above yields

i i 
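The displacement estimate for the norm $\|\cdot\|_*$ can be checked numerically in a toy setting. The following is a minimal sketch, not part of the paper's argument: it assumes $X$ is the circle $\mathbb{R}/\mathbb{Z}$ with arc-length distance, takes a hypothetical atomic probability measure, and only computes a lower bound for $\|\Phi_*\mu - \mu\|_*$ by testing against the 1-Lipschitz functions $x \mapsto d(x, c)$. All names (`circle_dist`, `star_norm_lower_bound`, the map `x + eps*sin(2*pi*x)`) are illustrative choices.

```python
import math

def circle_dist(x, y):
    """Arc-length distance on the circle R/Z (always at most 1/2)."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)

def integrate(f, points, weights):
    """Integral of f against the atomic measure sum_k weights[k] * delta_{points[k]}."""
    return sum(w * f(p) for p, w in zip(points, weights))

def star_norm_lower_bound(pts_a, wts_a, pts_b, wts_b, centers):
    """Lower bound for ||a - b||_* using the test functions x -> d(x, c).

    Each test function is 1-Lipschitz and bounded by 1/2, so it is admissible
    in the supremum; using finitely many of them only gives a lower bound.
    """
    best = 0.0
    for c in centers:
        f = lambda x, c=c: circle_dist(x, c)
        gap = integrate(f, pts_a, wts_a) - integrate(f, pts_b, wts_b)
        best = max(best, abs(gap))  # abs() accounts for the test function -f
    return best

# Hypothetical data: 200 equal atoms, pushed by a map moving points by at most eps.
eps = 0.05
points = [k / 200 for k in range(200)]
weights = [1 / 200] * 200
moved = [(p + eps * math.sin(2 * math.pi * p)) % 1.0 for p in points]
centers = [k / 50 for k in range(50)]
bound = star_norm_lower_bound(moved, weights, points, weights, centers)
print(bound <= eps)
```

Because only finitely many test functions are used, the computed value can undershoot the true norm; the sketch illustrates the inequality $\|\Phi_*\mu - \mu\|_* \le \varepsilon\|\mu\|$ rather than proving it.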

We shall also need to define some metrics on groups and their quotients.
Let $d_G$ be a right-invariant metric on $G$. One can construct this by choosing
an inner product on the Lie algebra of $G$ and then transporting this via
right-translation to a Riemannian metric on $G$. Then one defines $d_G(g,h)$
in the usual way by taking the infimum of lengths of differentiable curves
connecting $g$ to $h$. This allows us to define a metric on $X$ by

\[ d_X(a\Gamma, b\Gamma) = \inf_{\gamma \in \Gamma} d_G(a\gamma, b). \]

This metric has the property that $d_X(a\Gamma, b\Gamma)$ is equal to the size of the
smallest $g$ such that $ga\Gamma = b\Gamma$, where by "size" we mean $d_G(g, 1)$.

Finally, recall the following results of Malcev [7]. We can choose a canonical
basis for $\Gamma$. This is a collection $\gamma_1, \dots, \gamma_k \in \Gamma$ such that

1. every element of $G$ can be written uniquely as a product
$\gamma_1^{e_1} \cdots \gamma_k^{e_k}$ (where the $e_i$ are real numbers),

2. for each $i$ the set of all elements of the form
$\gamma_i^{e_i} \cdots \gamma_k^{e_k}$ is a normal subgroup $G_{(i)}$ of $G$,

3. for each $i$, $G_{(i)}/G_{(i+1)}$ is isomorphic to $\mathbb{R}$,

4. the sequence $G_{(i)}$ is a refinement of $G_j$ (that is, $G_j$ is a
subsequence of $G_{(i)}$).

Both proofs of Theorem 3.1 begin the same way.

Proof of Theorem 3.1. Choose a Malcev basis $\gamma_1, \gamma_2, \dots, \gamma_k$ as explained
above and suppose $\gamma_{k'+1}, \dots, \gamma_k$ are those coordinates lying in $G_{l-1}$. Let $F$
be the standard fundamental region for the action of $\Gamma$ on $G$. That is,

\[ F = \{ \gamma_1^{e_1}\gamma_2^{e_2}\cdots\gamma_k^{e_k} : 0 \le e_i < 1 \}. \]

Choose a finite partition of $[0,1)^{k-k'}$ into sets $C'_1, C'_2, \dots$ having small
diameter. For each $i$, let $C_i$ be the set of all
$g = \gamma_1^{e_1}\gamma_2^{e_2}\cdots\gamma_k^{e_k} \in F$ with
$(e_{k'+1}, \dots, e_k) \in C'_i$. This is a partition of $F$ (or $X$ if we wish). Arbitrarily
choose $(f_{k'+1}, \dots, f_k) \in C'_i$ and define a map $\Phi_i : C_i \to C_i$ by

\[ \gamma_1^{e_1}\gamma_2^{e_2}\cdots\gamma_k^{e_k} \mapsto \gamma_1^{e_1}\gamma_2^{e_2}\cdots\gamma_{k'}^{e_{k'}}\gamma_{k'+1}^{f_{k'+1}}\cdots\gamma_k^{f_k}. \]

Fix $\varepsilon > 0$. If the $C'_i$ are sufficiently small then each $C_i$ has diameter at
most $\varepsilon$ (in the right invariant metric on $G$). So each $\Phi_i$ moves points by at
most $\varepsilon$.

Let $\mu_i$ be the pushforward of $\mu|_{C_i}$ under $\Phi_i$. This is (probably) not a
probability measure. However $\sum_i \mu_i$ is a probability measure. In fact, we
will see (if we think of this measure on $X$) it closely approximates $\mu$ in a
way that's invariant under multiplication by $u$.

Define $\Phi(g) = \Phi_i(g)$ when $g \in C_i$. Then $\Phi$ moves each point by at most
$\varepsilon$ and only in the isometric direction. So $T^n\Phi$ and $T^n$ are point-wise $\varepsilon$-close.
Therefore

\[ \Big\|T^n_*(\mu) - T^n_*\Big(\sum_i \mu_i\Big)\Big\|_* = \|T^n_*(\mu) - T^n_*\Phi_*(\mu)\|_* \le \varepsilon. \]

As explained above, if we project the canonical coordinates $\gamma_1, \gamma_2, \dots$
to $G/G_{l-1}$ then, except for $\gamma_{k'+1}, \dots, \gamma_k$ which vanish, we get canonical
coordinates for the lattice $\Gamma G_{l-1}$ in $G/G_{l-1}$. Let $Q_i$ be the image of $\Phi_i$.
Each $Q_i$ is a Euclidean cube in $F$, which is bijectively mapped by $\pi$ to the
standard fundamental domain $F_{l-1}$ in $G/G_{l-1}$ for $X_{l-1}$ with respect to the
quotient coordinates. Haar measure $m_{l-1}$ on $X_{l-1}$ is the same as Lebesgue
measure (with respect to canonical coordinates). So, if we equip the cube $Q_i$
with Lebesgue measure $\lambda_i$ (of the appropriate dimension and with respect
to canonical coordinates in $G$), we see that $\pi|_{Q_i} : Q_i \to F_{l-1}$ is measure
preserving. Also notice that $\pi|_F : F \to F_{l-1}$ pushes $\mu$ forward to $m_{l-1}$ by
assumption, and hence does the same for $\sum_i \mu_i$ (i.e. these measures project
to Haar measure on $F_{l-1}$).

Fix $j$ and let $N_j \subset Q_j$ be a $\lambda_j$-null set. Let $N = \pi(N_j)$ and let
$N_i = (\pi|_{Q_i})^{-1}(N)$. Then

\[ 0 = \lambda_j(N_j) = m_{l-1}(N) = \Big(\pi_*\sum_i \mu_i\Big)(N) = \sum_i \mu_i(\pi^{-1}N) = \sum_i \mu_i(N_i). \]

It follows that $\mu_j(N_j) = 0$. We have proven that each $\mu_i$ is absolutely
continuous with respect to $\lambda_i$. The Radon-Nikodym Theorem allows us to write
$d\mu_i = f'_i\,d\lambda_i$ where $f'_i$ is a measurable function on $Q_i$. Note that
$0 \le f'_i \le 1$ $\lambda_i$-almost everywhere.

The proof now continues in two ways. 

First proof of Theorem 3.1. The commutators $[v,h] = z$ where $v \in G_{l-2}$
and $h \in G$ generate $[G_{l-2}, G] = G_{l-1}$. We will show that when $n$ is large,
$T^n_*\mu$ is nearly invariant under the action of $z$. By finite dimensionality, the
same is then true for the action of any element of $G_{l-1}$.

Choose a continuous function $f_i$ on $Q_i$, $|f_i| \le 1$, which agrees with $f'_i$ on
a set $S_i \subset Q_i$. Make this choice so that $\sum_i \mu_i(S_i) > 1 - \varepsilon$. Then
$\sum_i f_i\,d\lambda_i$ and $\sum_i f'_i\,d\lambda_i$ agree on the set $\bigcup_i S_i$, and both
give it measure at least $1 - \varepsilon$.

Write $d\nu = \sum_i f_i\,d\lambda_i$. Then

\[ \|T^n_*\mu - T^n_*\nu\|_* \le \Big\|T^n_*\mu - T^n_*\Big(\sum_i \mu_i\Big)\Big\|_* + \Big\|T^n_*\Big(\sum_i \mu_i\Big) - T^n_*\nu\Big\|_* < 2\varepsilon. \]

We want to show that $T^n_*\mu$ is nearly invariant under $z$ when $n$ is large.
The calculation above shows us that it suffices to show $T^n_*\nu$ is nearly
invariant under $z$. The advantage of using $\nu$ instead of $\mu$ is that it is
supported on geometrically nice pieces $Q_i$ of $F$ on which it is given by
continuous density functions. It is easier to understand what happens to such
a measure when perturbed.

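Write $u = \gamma_1^{e_1} \cdots \gamma_a^{e_a} g$ where $\gamma_1, \dots, \gamma_a$ are the elements of the Malcev
basis of $\Gamma$ lying in $G \setminus G_1$ and $g \in G_1$. Let $v \in G_{l-2}$ be given by
$v = \eta_1^{f_1} \cdots \eta_b^{f_b}$ where $\eta_1, \dots, \eta_b$ are the elements of the Malcev basis
of $\Gamma$ lying in $G_{l-2}$. Then

\[ [v, u^n] = \big[\eta_1^{f_1} \cdots \eta_b^{f_b},\ (\gamma_1^{e_1} \cdots \gamma_a^{e_a} g)^n\big] = \prod_{i,j} [\eta_j, \gamma_i]^{n f_j e_i}. \]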

The $[\eta_j, \gamma_i]$ may not be linearly independent in $G_{l-1}$. In fact, some must
vanish. But, since the one parameter subgroups through the $\gamma_1, \dots, \gamma_a$ generate
$G$ and since the one parameter subgroups through the $\eta_j$ generate $G_{l-2}$, we
know the one parameter subgroups through the $[\eta_j, \gamma_i]$ generate $G_{l-1}$. In
other words, the map

\[ \mathbb{R}^{ba} \ni (t_{j,i}) \mapsto \prod_{i,j} [\eta_j, \gamma_i]^{t_{j,i}} \in G_{l-1} \]

is surjective. It induces a map $\mathbb{R}^{ba}/\mathbb{Z}^{ba} \to G_{l-1}/\Gamma_{l-1}$ (where $\Gamma_{l-1}$ is the last
nonzero term in the lower central series for $\Gamma$). Now let us choose the $f_j$
such that $\{f_j e_i : j, i\}$ together with 1 forms a collection of numbers which
is linearly independent over $\mathbb{Q}$. To do this, choose $f_1$ to be transcendental
over $1, e_1, \dots, e_a$. Then choose $f_2$ to be transcendental over
$1, e_1, \dots, e_a, f_1$, etc. Then, a $\mathbb{Q}$-linear relationship between the $e_i f_j$
would allow us to solve for $f_j$ (with $j$ maximum) in terms of
$1, e_1, \dots, e_a, f_1, \dots, f_{j-1}$, which would contradict the choice of $f_j$.

It follows that the backwards orbit $\{n(f_j e_i)_{j,i} : n \in -\mathbb{N}\}$ is dense in
$\mathbb{R}^{ba}/\mathbb{Z}^{ba}$. So, it has dense image in $G_{l-1}/\Gamma_{l-1}$ which, in turn, surjects onto
$G_{l-1}/(\Gamma \cap G_{l-1})$ where it has dense image as well. We have chosen an element
$v \in G_{l-2}$ such that $\{[v, u^n] : n < 0\}$ has dense image in $G_{l-1}/(\Gamma \cap G_{l-1})$.

The commutator restricts to a continuous map $G_{l-2} \times G \to G_{l-1}$ which
induces a map

\[ G_{l-2}/G_{l-1} \times G/G_1 \to G_{l-1} \]

of real vector spaces. It follows from continuity and the identity
$[xy, z] = [x, z]^y [y, z]$ that this map is bilinear.

Let $\lambda > 0$ and observe that $[v^{\lambda}, u^{n/\lambda}] = [v, u^n]$. This tells us that there
exists some $N < 0$ such that for all $n < N$ we can choose $h \in G_{l-2}$ and
$\gamma \in \Gamma \cap G_{l-1}$ such that $d_G(h, 1) < \delta$ and $d_G([h, u^n]\gamma, z) < \delta$. We now use
centrality of $\gamma$ and right invariance of the metric to get

\[
\begin{aligned}
d_G(zu^{-n}, u^{-n}h\gamma) &= d_G(z, u^{-n}hu^n\gamma) \\
&\le d_G(z, [h, u^n]\gamma) + d_G([h, u^n]\gamma, u^{-n}hu^n\gamma) \\
&< 2\delta.
\end{aligned}
\]

For any $g \in G$ we have

\[ 2\delta > d_X(zu^{-n}g\Gamma, u^{-n}h\gamma g\Gamma) = d_X(zu^{-n}g\Gamma, u^{-n}hg\Gamma). \]

In other words, the actions of $zu^{-n}$ and $u^{-n}h$ on $X$ are nearly the same.

To summarize, we have shown that, for all $\delta > 0$ there exists a large
integer $N'$ (equal to $-N$) such that for all $n > N'$ there is some $h \in G_{l-2}$
with $d_G(h, 1) < \delta$ and $d_X(zu^n.x, u^nh.x) < 2\delta$ for all $x \in X$.

Now we study the push-forward of $\nu$ under $h$. Given $g \in G$ we can use
the canonical coordinates to write

\[ g = \gamma_1^{e_1}\gamma_2^{e_2}\cdots\gamma_{k'}^{e_{k'}}\zeta \quad\text{and}\quad hg = \gamma_1^{e'_1}\gamma_2^{e'_2}\cdots\gamma_{k'}^{e'_{k'}}\zeta' \]

where $\zeta, \zeta' \in G_{l-1}$. Write $\kappa(g) = \zeta(\zeta')^{-1}$. Write $\alpha$ for $\varepsilon$ divided by the total
number of cells $C_i$. Since $F$ is compact, by requiring $h$ be sufficiently close to 1
(i.e. by letting $\delta$ be sufficiently small in the argument above) we can achieve
$d(\kappa(g), 1) < \alpha$ for all $g \in F$. Also require $\delta < \alpha$. Let $\nu'$ be the push-forward
of $\nu$ under $g \mapsto hg$, and let $\nu''$ be the push-forward of $\nu$ under $g \mapsto \kappa(g)hg$.
Since these two maps are point-wise $\alpha$-close and differ only in the isometric
direction, as in the arguments above, we have $\|T^n_*(\nu') - T^n_*(\nu'')\|_* < \alpha$ for
all $n$.

Let $R_i := \{\kappa(g)hg : g \in Q_i\} \cap Q_i$. Both $h$ and $\kappa(g)$ are within $\alpha$ of 1
by assumption, so $d(\kappa(g)hg, g) < 2\alpha$. Recall that $Q_i$ is a cube with respect
to canonical coordinates that was created by fixing all coordinates lying in
$G_{l-1}$. Loosely speaking, $Q_i$ is a cube lying in a particular plane in $G$. By
the choice of $\kappa(g)$ we know that for $g \in Q_i$, $\kappa(g)hg$ lies in the same plane.
Since $Q_i$ has side length 1, it follows that $R_i$ contains a cube of the same
dimension, centered in $Q_i$, and having side length $1 - 4\alpha$. Since $Q_i$ has
dimension $k'$ we conclude $\lambda_i(R_i) \ge (1 - 4\alpha)^{k'}$. Similarly

\[ \lambda_i(\{\kappa(g)hg : g \in Q_i\} \cup Q_i) \le (1 + 4\alpha)^{k'}. \]

Therefore

\[ \lambda_i(\{\kappa(g)hg : g \in Q_i\} \,\triangle\, Q_i) \le (1 + 4\alpha)^{k'} - (1 - 4\alpha)^{k'} =: \rho. \]

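By construction, $\nu''|_{R_i}$ is absolutely continuous with respect to $\lambda_i$ and
has density function $f''_i(g) := f_i(\kappa(g)hg)$. The functions $f_i$ form an
equicontinuous family. So, by requiring $\kappa(g)$ be sufficiently close to 1 we may
assume $|f''_i(g) - f_i(g)| < \alpha$ for all $i$ and all $g \in R_i$. We now have

\[
\begin{aligned}
\|T^n_*\nu - T^n_*\nu''\|_* &= \Big\| \sum_i (f_i \circ T^{-n})\,dT^n_*\lambda_i - \sum_i (f''_i \circ T^{-n})\,dT^n_*\lambda_i \Big\|_* \\
&\le \rho + \sum_i \|f_i \circ T^{-n} - f''_i \circ T^{-n}\|\,\lambda_i(R_i) \\
&\le \rho + \sum_i \alpha = \rho + \varepsilon.
\end{aligned}
\]

Now we overestimate $\|T^n_*\mu - z.T^n_*\mu\|_*$ by collecting results and applying
the triangle inequality to the sequence of measures $T^n_*\mu$, $T^n_*\nu$, $T^n_*\nu''$,
$T^n_*\nu' \approx z.T^n_*\nu$. This calculation yields $2\varepsilon + (\rho + \varepsilon) + \alpha$, which tends to 0 as
$\varepsilon$ tends to 0.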

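This implies that any weak* limit $\theta$ of $T^n_*\mu$ is invariant under $G_{l-1}$. Write
$\pi : G/\Gamma \to G/G_{l-1}\Gamma$ and let $\mathcal{A}$ be the inverse image of the Borel $\sigma$-algebra
on $X_{l-1}$ under $\pi$ (this is the algebra of $G_{l-1}$-invariant sets). We then have
that for $\theta$-almost every $x \in X$ the conditional measure $\theta_x^{\mathcal{A}}$ is equal to Haar
measure $\lambda_{\pi(x)}$ on the torus $\pi^{-1}\pi(x)$. Write $m_{l-1}$ for Haar measure on $X_{l-1}$.
By assumption $\pi_*\mu = m_{l-1}$, so $\pi_*\theta = m_{l-1}$ as well. Therefore

\[ \theta = \int \theta_x^{\mathcal{A}}\,d\theta(x) = \int \lambda_y\,d\pi_*\theta(y) = \int \lambda_y\,dm_{l-1}(y) = m. \]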

□

Second proof of Theorem 3.1. Choose a continuous density function $\beta$ on
$G_{l-1}$ such that $\beta \ge 0$, $\int \beta\,dm_{G_{l-1}} = 1$, and $\beta = 0$ outside an $\varepsilon$-neighborhood
of 1. Define $f_i \in C(X)$ by $f_i(gz\Gamma) = f'_i(g)\beta(z)$ for $g \in C_i$ and $z$ in the $\varepsilon$-ball
around 1 in $G_{l-1}$ and let $f_i = 0$ elsewhere. When $\varepsilon$ is smaller than the
injectivity radius of $X$ this is well defined. Notice that $f_i \in L^2(X)$.

Since $d\nu_i := f_i\,dm$ can be perturbed in the central directions to yield
$f'_i\,d\lambda_i$, the two do not deviate from one another under the application of
$T^n_*$. More formally, the map $zg \mapsto g$ which collapses $B_\varepsilon^{G_{l-1}}(1)C_i$ to $C_i$ takes
$d\nu_i = f_i\,dm$ to $\mu_i = f'_i\,d\lambda_i$ and moves each point by at most $\varepsilon$. Therefore,

\[ \|T^n_*\mu_i - T^n_*\nu_i\|_* = \|\mu_i - \nu_i\|_* = \|f'_i\,d\lambda_i - f_i\,dm\|_*, \]

and so

\[ \Big\|T^n_*\Big(\sum_i \mu_i\Big) - T^n_*\Big(\sum_i \nu_i\Big)\Big\|_* < \varepsilon. \]

Finally,

\[ \Big\|T^n_*\mu - T^n_*\Big(\sum_i \nu_i\Big)\Big\|_* < 2\varepsilon. \]



By construction $\nu := \sum_i \nu_i$ is absolutely continuous with respect to $m$, so
we can write $d\nu = f\,dm$. Since $|f_i| \le 1$, $f$ is bounded, hence in $L^2(X)$. Being
the density function of a probability measure, $\int f\,dm = 1$. Let $\varphi \in C(X)$ be
1-Lipschitz with $\sup|\varphi| \le 1$. Then

\[ \lim_{n\to\infty} \int \varphi\,dT^n_*\nu = \lim_{n\to\infty} \int (\varphi \circ T^n) f\,dm = \lim_{n\to\infty} \langle \varphi \circ T^n, f \rangle = \int \varphi\,dm \int f\,dm = \int \varphi\,dm, \]

where the third equality comes from the fact that $(X,T)$ has countable
Lebesgue spectrum on the orthocomplement of $L^2(X_1)$.
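Finally, let $L$ be any limit value of $\int \varphi\,dT^n_*\mu$. Then

\[
\begin{aligned}
\Big| L - \int \varphi\,dm \Big| &\le \Big| L - \lim_{n\to\infty} \int \varphi\,dT^n_*\nu \Big| + \Big| \lim_{n\to\infty} \int \varphi\,dT^n_*\nu - \int \varphi\,dm \Big| \\
&\le \limsup_{n\to\infty} \|T^n_*\mu - T^n_*\nu\|_* + 0 \\
&\le 2\varepsilon.
\end{aligned}
\]
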
Since $\varepsilon$ was arbitrary, it follows that $\lim_{n\to\infty} \int \varphi\,dT^n_*\mu = \int \varphi\,dm$. Since
$\varphi \in C(X)$ was arbitrary, we have shown $\lim_{n\to\infty} T^n_*\mu = m$, as desired. □

Corollary 3.2. Let $\mu$ be a probability measure on $X$ whose projection onto
$X_{l-1}$ is absolutely continuous with respect to Haar measure. Then any weak*
limit of $T^n_*\mu$ is invariant under $G_{l-1}$.

Proof. This follows from the proof of Theorem 3.1. □

Corollary 3.3. Suppose $G$ is a 2-step nilpotent group and $T$ is a minimal
nilrotation on a compact nilmanifold $X = G/\Gamma$. Then $(X,T)$ is twisting.

Proof. Apply Theorem 3.1. □

Corollary 3.4. Minimal nilrotations are weakly twisting.

Proof. Let $P_i$ be the set of probability measures on $X$ which project to Haar
measure on $X_i$. Then $P_1 \supset P_2 \supset \cdots \supset P_l = \{m\}$. By Theorem 3.1 we know
that for $\mu \in P_i$, $i \ge 1$, any subsequential limit of $T^n_*\mu$ lies in the compact
set $P_{i+1}$. Equivalently $d(T^n_*\mu, P_{i+1}) \to 0$. Let $\theta$ be an invariant probability
measure on $(P_1, T_*)$. For any $\varepsilon > 0$ there exists $n$ such that

\[ 1 - \varepsilon < \theta(\{\mu \in P_1 : d(T^n_*\mu, P_2) < \varepsilon\}) = \theta(\{\mu \in P_1 : d(\mu, P_2) < \varepsilon\}) \]

where the equality follows from invariance of $\theta$ and $P_2$. We have

\[ \theta(P_2) = \theta\Big(\bigcap_{\varepsilon > 0} \{\mu \in P_1 : d(\mu, P_2) < \varepsilon\}\Big) \ge \lim_{\varepsilon \to 0} (1 - \varepsilon) = 1. \]

Repeating this argument inductively yields $\theta(P_i) = 1$ for all $i$. In particular
$\theta(P_l) = \theta(\{m\}) = 1$. That is, $\theta = \delta_m$. □
Proof of Theorem 1.2. Apply Corollary 2.8 to the conclusion of Corollary 3.4. □

Proof of Theorem 1.3. This is just a rephrasing of Corollary 3.3. □



Question 3.5. Can Corollary 3.4 be strengthened? Are all minimal
nilrotations twisting?

4 Skew products 

By a skew product we mean a system with space $X \times Y$ and transformation of
the form $T(x, y) = (T_0(x), f(x, y))$, where $T_0 : X \to X$ is a homeomorphism
and $f : X \times Y \to Y$ is continuous. We say such a system has base $(X, T_0)$
and fiber $Y$.






Furstenberg proved in [3] that a large class of skew products are uniquely
ergodic. He also discusses systems derived from an irrational circle rotation
by repeatedly taking skew products. In particular, he proves transformations
on $\mathbb{T}^d$ of the form

\[ T(x_1, \dots, x_d) = (x_1 + \alpha, x_2 + f_1(x_1), \dots, x_d + f_{d-1}(x_1, \dots, x_{d-1})) \]

are uniquely ergodic when the $f_i$ are Lipschitz and homotopically non-trivial.
We strengthen his result by showing that such systems are weakly twisting,
and when $d = 2$ we show they are twisting.

The goal of this section is to prove Theorems 1.4 and 1.5. Just as in the
case of nilrotations, we will derive these theorems as corollaries of another
theorem which tells us that when one takes weak* limits of $T^n_*\mu$ one gets
measures with more invariance properties.

Theorem 4.1. Let $X = \mathbb{T}^d$ and suppose $T : X \to X$ is of the form

\[ T(x_1, \dots, x_d) = (x_1 + \alpha, x_2 + f_1(x_1), \dots, x_d + f_{d-1}(x_1, \dots, x_{d-1})), \]

where $\alpha$ is irrational, $f_k$ is Lipschitz, and each $f_k(x_1, \dots, x_{k-1}, \cdot) : \mathbb{T} \to \mathbb{T}$
is homotopically non-trivial. If the projection of $\mu$ onto the first $d - 1$
coordinates is absolutely continuous with respect to Haar measure on $\mathbb{T}^{d-1}$ then
any weak* limit of $T^n_*\mu$ is invariant under rotation in the last coordinate.

Notice that the statement that $x_k \mapsto f_k(x_1, \dots, x_k)$ is homotopically
non-trivial is independent of the choice of $x_1, \dots, x_{k-1}$ since all such choices
lead to homotopic loops.

Corollary 4.2. Any Lipschitz iterated skew product system $(X,T)$ (as in
Theorem 4.1) is weakly twisting.

Proof. The proof is the same as that of Corollary 3.4. We stratify $\mathcal{P}(X)$
into sets $P_i$ consisting of measures which, when projected onto the first $i$
coordinates, give Lebesgue measure. By induction we use Theorem 4.1 to
conclude that if $\theta$ is an invariant measure on $\mathcal{P}(X)$ then $\theta(P_i) = 1$ for all $i$.
In particular $\theta(P_d) = \theta(\{m\}) = 1$, so $\theta = \delta_m$. □

Corollary 4.3. Any Lipschitz skew product $(x,y) \mapsto (x + \alpha, y + f(x))$ on
$\mathbb{T}^2$ with $\alpha$ irrational and $f$ homotopically non-trivial is twisting.

Proof. This follows immediately from Theorem 4.1 and the definition of
twisting. □

As in the nilrotation case, Theorems 1.4 and 1.5 are trivial observations
given Corollaries 4.2 and 4.3 and the results of Section 2. We leave this to
the reader.
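The behaviour described by Corollary 4.3 can be seen in the simplest case. The sketch below is an illustration only, under stated assumptions: it takes the hypothetical skew product $T(x,y) = (x + \alpha, y + x)$ on $\mathbb{T}^2$ (so $f(x) = x$, which is Lipschitz of degree 1), starts from a grid approximation of Lebesgue measure in $x$ times a point mass at $y = 0$, and computes a Fourier coefficient of the push-forward. The names `skew_step`, `fourier_coefficient` and the choice of `ALPHA` are not from the text.

```python
import cmath
import math

def skew_step(x, y, alpha):
    """One step of T(x, y) = (x + alpha, y + f(x)) with the assumed f(x) = x."""
    return (x + alpha) % 1.0, (y + x) % 1.0

def fourier_coefficient(n, a, b, alpha, grid=1000):
    """Integral of exp(2*pi*i*(a*x + b*y)) against the n-th push-forward.

    The initial measure is uniform on the grid points (k/grid, 0), a discrete
    stand-in for Lebesgue measure on the base times a point mass in the fiber.
    """
    total = 0j
    for k in range(grid):
        x, y = k / grid, 0.0
        for _ in range(n):
            x, y = skew_step(x, y, alpha)
        total += cmath.exp(2j * math.pi * (a * x + b * y))
    return total / grid

ALPHA = math.sqrt(2) - 1  # an irrational rotation number (hypothetical choice)
before = abs(fourier_coefficient(0, 0, 1, ALPHA))  # fiber mode at time 0
after = abs(fourier_coefficient(7, 0, 1, ALPHA))   # same mode after 7 steps
print(before, after)
```

At time 0 the fiber mode has modulus 1 because the measure is a point mass in $y$. After $n$ steps the fiber coordinate is $nx$ plus a constant, so the same mode integrates to (numerically) zero over the base, which is the shearing mechanism behind the corollary.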

Before setting out to prove Theorem 4.1 we first need an ergodic theorem
for functions taking values in the upper unipotent group $U$ (Theorem 1.6).
For this we require lemmata.

Lemma 4.4. Suppose $u \in U$ and $\lambda = \lambda(j - i)$ is as in Theorem 1.6. Then

\[ \lim_{n\to\infty} \big|\theta_{1/n}(u^n)_{i,j} - \lambda\,u_{i,i+1}u_{i+1,i+2}\cdots u_{j-1,j}\big| = 0, \]

for all $i, j$ with $j \ge i$.
Proof. We will prove by induction on $j - i$ that when $j - i \ge 0$, $(u^n)_{i,j}$ is a
polynomial of degree at most $j - i$ in $n$ with the coefficient on $n^{j-i}$ equal to
$\lambda\,u_{i,i+1}u_{i+1,i+2}\cdots u_{j-1,j}$ (from which the claim immediately follows). When
$j - i = 0$ we have an empty product and the result is obvious. For larger
$j - i$,

\[
\begin{aligned}
(u^n)_{i,j} = (u u^{n-1})_{i,j} &= \sum_{k=i}^{j} u_{i,k}(u^{n-1})_{k,j} \\
&= (u^{n-1})_{i,j} + \sum_{k=i+1}^{j} u_{i,k}(u^{n-1})_{k,j} \\
&= (u^{n-2})_{i,j} + \sum_{k=i+1}^{j} u_{i,k}(u^{n-2})_{k,j} + \sum_{k=i+1}^{j} u_{i,k}(u^{n-1})_{k,j} \\
&= \sum_{m=0}^{n-1} \sum_{k=i+1}^{j} u_{i,k}(u^{m})_{k,j} \\
&= \sum_{m=0}^{n-1} \Big( u_{i,i+1}(u^{m})_{i+1,j} + \sum_{k=i+2}^{j} u_{i,k}(u^{m})_{k,j} \Big).
\end{aligned}
\]

Write $P(m)$ for the expression in parentheses. By our inductive hypothesis, the
inner sum is a polynomial (in $m$) of degree strictly smaller than $j - i - 1$. Also
by our inductive hypothesis, $u_{i,i+1}(u^m)_{i+1,j}$ is a polynomial of degree at most
$j - i - 1$ with coefficient on $m^{j-i-1}$ equal to $\lambda(j - i - 1)\,u_{i,i+1}\cdots u_{j-1,j}$.
So $P(m)$ inherits this property. It follows that

\[ \sum_{m=0}^{n-1} P(m) \]

is a polynomial of degree at most $j - i$ with coefficient on $n^{j-i}$ equal to
$\lambda(j - i)\,u_{i,i+1}u_{i+1,i+2}\cdots u_{j-1,j}$. □
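The polynomial growth in Lemma 4.4 is easy to check in exact arithmetic. The following is a minimal sketch under two assumptions that are not spelled out in this section: that the dilation $\theta_t$ multiplies the $(i,j)$ entry by $t^{j-i}$, and that $\lambda(j - i) = 1/(j-i)!$, which is the value the binomial expansion of $(1 + N)^n$ produces for a nilpotent $N$. The helper names and the example matrix are hypothetical.

```python
from fractions import Fraction
from math import factorial

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    d = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]

def mat_pow(u, n):
    """u**n by repeated squaring, in exact rational arithmetic."""
    d = len(u)
    result = [[Fraction(int(i == j)) for j in range(d)] for i in range(d)]
    base = [[Fraction(entry) for entry in row] for row in u]
    while n > 0:
        if n & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        n >>= 1
    return result

def dilate(u, t):
    """Assumed form of theta_t: scale the (i, j) entry by t**(j - i)."""
    d = len(u)
    return [[u[i][j] * Fraction(t) ** (j - i) if j >= i else Fraction(0)
             for j in range(d)] for i in range(d)]

def predicted_limit(u, i, j):
    """Assumed lambda(j - i) = 1/(j - i)! times the super-diagonal product."""
    product = Fraction(1)
    for k in range(i, j):
        product *= Fraction(u[k][k + 1])
    return product / factorial(j - i)

# Hypothetical 3x3 upper unipotent matrix (0-based indices in the code).
u = [[1, 2, 5], [0, 1, 3], [0, 0, 1]]
n = 1000
scaled = dilate(mat_pow(u, n), Fraction(1, n))
error = abs(scaled[0][2] - predicted_limit(u, 0, 2))
print(float(error))
```

For this matrix the corner entry of $u^n$ is $5n + 6\binom{n}{2}$, so after dividing by $n^2$ the distance to the predicted limit $3$ decays like $1/n$; the entry $u_{1,3} = 5$ off the first super-diagonal only affects lower-order terms.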

Lemma 4.5. For any $\varepsilon > 0$ and $M > 0$ there exists $\delta > 0$ such that the
following holds: Suppose $\{u, u_n : n = 1, 2, \dots\}$ is a bounded subset of $U$ with
all entries on the first super-diagonal bounded by $M$. Also suppose that for
any $n$, all super-diagonal entries $(u_n)_{i,i+1}$ differ from $u_{i,i+1}$ by at most $\delta$.
Then

\[ \limsup_{n\to\infty} \big|\theta_{1/n}(u_1u_2\cdots u_n)_{i,j} - \theta_{1/n}(u^n)_{i,j}\big| < \varepsilon \]

for all $i, j$.

Proof. By Lemma 4.4 we may as well assume $u_{i,j} = 0$ when $j - i \ge 2$ since
this does not change the asymptotic behavior of the entries of $u^n$. Notice
that the conclusion we are trying to derive is invariant under conjugation by
a diagonal matrix. So we may conjugate everything by some diagonal matrix
having only $1, -1$ on the diagonal to ensure that $u$ has non-negative entries
on the first super-diagonal. Let $M'/2$ be the universal upper bound on the
entries of $u, u_n$. Write $u_n = u + \delta_n$ where $\delta_n$ is strictly upper triangular, with
$|(\delta_n)_{i,i+1}| < \delta$ and $|(\delta_n)_{i,j}| < M'$. Now consider the matrix

\[ u_1u_2\cdots u_n - u^n = (u + \delta_1)(u + \delta_2)\cdots(u + \delta_n) - u^n. \]

Multiplying out the right side of this equation yields another instance of $u^n$
which cancels with the $-u^n$ to leave a sum of products of $u$ with the $\delta_i$. The
entries of this sum can only increase if we replace $\delta_k$ by $|\delta_k|$ (where
$|\delta_k|_{i,j} := |(\delta_k)_{i,j}|$). Let $\Delta$ be the strictly upper triangular matrix having $\delta$
on the first super-diagonal and $M'$ on all other super-diagonals. This matrix has
entries no smaller than the entries of any $|\delta_i|$. Therefore

\[ (u_1u_2\cdots u_n - u^n)_{i,j} \le \big((u + |\delta_1|)\cdots(u + |\delta_n|) - u^n\big)_{i,j} \le \big((u + \Delta)^n - u^n\big)_{i,j}. \]

Applying Lemma 4.4 again yields

\[ \limsup_{n\to\infty} \frac{1}{n^{j-i}}(u_1u_2\cdots u_n - u^n)_{i,j} \le \lambda\prod_{k=i}^{j-1}(u_{k,k+1} + \delta) - \lambda\prod_{k=i}^{j-1}u_{k,k+1}. \]

A symmetric calculation gives the same estimate for the other difference
$u^n - u_1u_2\cdots u_n$. Therefore

\[ \limsup_{n\to\infty} \frac{1}{n^{j-i}}\big|(u_1u_2\cdots u_n - u^n)_{i,j}\big| \le \lambda(M + \delta)^{d-1} - \lambda M^{d-1}, \]

which can be made as small as we like by requiring $\delta$ to be sufficiently
small. □
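The stability estimate of Lemma 4.5 can be tested on $3 \times 3$ matrices, where everything is explicit. This is a hedged sketch, not the lemma's proof: it stores an upper unipotent matrix as the triple $(u_{1,2}, u_{2,3}, u_{1,3})$, perturbs the first super-diagonal by a hypothetical sign pattern of size $\delta$, and compares the normalized corner entry with the bound $\lambda\big((a+\delta)(b+\delta) - ab\big)$, assuming $\lambda(2) = 1/2$ and $a, b \ge 0$. All function names are illustrative.

```python
from fractions import Fraction

def heis_mul(p, q):
    """Multiply 3x3 upper unipotent matrices stored as (u12, u23, u13)."""
    a, b, c = p
    a2, b2, c2 = q
    return (a + a2, b + b2, c + c2 + a * b2)

def product(matrices):
    """Ordered product m_1 m_2 ... m_n."""
    result = (Fraction(0), Fraction(0), Fraction(0))  # identity matrix
    for m in matrices:
        result = heis_mul(result, m)
    return result

def corner_gap(a, b, delta, n):
    """|(u_1...u_n - u^n)_{1,3}| / n^2 for a hypothetical perturbation pattern."""
    a, b, delta = Fraction(a), Fraction(b), Fraction(delta)
    signs = [1 if k % 3 else -1 for k in range(n)]
    perturbed = [(a + delta * s, b - delta * s, Fraction(0)) for s in signs]
    exact = product([(a, b, Fraction(0))] * n)
    return abs(product(perturbed)[2] - exact[2]) / Fraction(n) ** 2

def corner_bound(a, b, delta):
    """lambda * ((a + delta)(b + delta) - a*b) with the assumed lambda = 1/2."""
    a, b, delta = Fraction(a), Fraction(b), Fraction(delta)
    return ((a + delta) * (b + delta) - a * b) / 2

gap = corner_gap(2, 3, Fraction(1, 100), 200)
print(gap <= corner_bound(2, 3, Fraction(1, 100)))
```

In this small case the corner entry of the product is $\sum_{k<l} a_k b_l$, so each of the $\binom{n}{2}$ terms differs from $ab$ by at most $(a+\delta)(b+\delta) - ab$, which is why the bound holds for every $n$ and not only in the limit.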






In order to simplify notation, we use the language of cocycles. By a
cocycle on an invertible dynamical system taking values in a group $G$, we
mean a function $C : X \times \mathbb{Z} \to G$ with the property that
$C(x, n + m) = C(T^m x, n)C(x, m)$. One can easily extend this definition by
replacing $\mathbb{Z}$ with any group or semi-group. With more complicated groups
acting, the structure of a cocycle may be somewhat constrained. With the
integers acting (as in our definition), any map $f : X \to G$ yields a cocycle:
when $n > 0$ we let

\[ C(x, n) = f(T^{n-1}x) \cdots f(Tx)f(x) \quad\text{and}\quad C(x, -n) = C(T^{-n}x, n)^{-1}. \]

One immediately sees that all cocycles arise in this way by taking
$f(x) = C(x, 1)$.
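The cocycle identity can be verified mechanically on a small example. The sketch below makes several illustrative assumptions: the system is the cyclic shift on $\mathbb{Z}/7$, the group is the integer Heisenberg group (3x3 upper unipotent matrices stored as triples), and the generating map `f` is an arbitrary hypothetical choice. Negative times are defined by $C(x, -n) = C(T^{-n}x, n)^{-1}$, which is what the identity $C(x, n+m) = C(T^m x, n)C(x, m)$ forces when $m = -n$.

```python
IDENTITY = (0, 0, 0)

def heis_mul(p, q):
    """Multiply 3x3 upper unipotent matrices stored as (u12, u23, u13)."""
    a, b, c = p
    a2, b2, c2 = q
    return (a + a2, b + b2, c + c2 + a * b2)

def heis_inv(p):
    """Inverse of the unipotent matrix stored as (u12, u23, u13)."""
    a, b, c = p
    return (-a, -b, a * b - c)

def shift(x, m, T, T_inv):
    """Apply T^m to x, allowing negative m."""
    step = T if m >= 0 else T_inv
    for _ in range(abs(m)):
        x = step(x)
    return x

def make_cocycle(f, T, T_inv):
    """Cocycle generated by f: C(x, n) = f(T^{n-1} x) ... f(T x) f(x) for n > 0."""
    def C(x, n):
        if n < 0:
            return heis_inv(C(shift(x, n, T, T_inv), -n))
        value = IDENTITY
        for _ in range(n):
            value = heis_mul(f(x), value)  # newest factor goes on the left
            x = T(x)
        return value
    return C

# Hypothetical system: the shift on Z/7 with a non-commuting generating map.
N = 7
T = lambda x: (x + 1) % N
T_inv = lambda x: (x - 1) % N
f = lambda x: (x, 2 * x + 1, x * x)
C = make_cocycle(f, T, T_inv)
print(C(3, 2) == heis_mul(f(T(3)), f(3)))
```

A non-commutative target group is used on purpose, so that the order of the factors in $C(x, n)$ matters and an incorrect ordering would break the identity.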

Proof of Theorem 1.6. Assume $m$ is ergodic. Unpacking the definition of $\theta_t$,
we see that the goal is to demonstrate the validity of the limit

\[ \lim_{n\to\infty} \frac{1}{n^{j-i}} C(x, n)_{i,j} = \lambda \prod_{k=i}^{j-1} \int f(x)_{k,k+1}\,dm. \]

Fix $\varepsilon > 0$. Let $M$ be a bound on the first super-diagonal entries, as required
by Lemma 4.5, and let $\delta$ be given by that lemma. Choose $N$ sufficiently large that

\[ \Big| \frac{1}{N}\sum_{n=0}^{N-1} f(T^n x)_{i,i+1} - \int_X f_{i,i+1}\,dm \Big| < \delta \]

for all $i$ and for all $x$ in a set $E$ with $m(E) > 1 - \varepsilon^2$.
Notice that for any $x$,

\[ \frac{1}{N}\sum_{n=0}^{N-1} f(T^n x)_{i,i+1} = \frac{1}{N}C(x, N)_{i,i+1} = (\theta_{1/N}C(x, N))_{i,i+1}. \]

With this in mind, let $v_k = v_k(x) = \theta_{1/N}C(T^{kN}x, N)$ and let $u \in U$ be the
matrix with ones on the diagonal, $u_{i,i+1} = \int f_{i,i+1}\,dm$, and zeros elsewhere.

Although we are assuming $(X, \mathcal{B}, m, T)$ is ergodic, $(X, \mathcal{B}, m, T^N)$ may
not be. Let $m = \int \mu_x\,dm(x)$ be the associated ergodic decomposition. We
also have the decomposition $m = \int N^{-1}(\mu_x + \mu_{Tx} + \cdots + \mu_{T^{N-1}x})\,dm(x)$. But

\[ T_*\big(N^{-1}(\mu_x + \cdots + \mu_{T^{N-1}x})\big) - N^{-1}(\mu_x + \cdots + \mu_{T^{N-1}x}) = N^{-1}(\mu_{T^N x} - \mu_x), \]

which is zero $m$-almost everywhere. We have written the $T$-ergodic measure
$m$ as a convex combination of $T$-invariant measures, so the decomposition
must be degenerate. That is, $m = N^{-1}(\mu_x + \mu_{Tx} + \cdots + \mu_{T^{N-1}x})$ $m$-almost
everywhere. Let $x_0, x$ be two points for which this happens. Any set which
has positive measure for $\mu_x$ also has positive measure for
$m = N^{-1}(\mu_{x_0} + \cdots + \mu_{T^{N-1}x_0})$ and so must have positive measure for some
$\mu_{T^l x_0}$. But distinct ergodic measures must be mutually singular. So, it must
be that $\mu_x = \mu_{T^l x_0}$ for some $l$.

Since $\varepsilon^2 > m(X \setminus E) = \int \mu_x(X \setminus E)\,dm(x)$, there is a set $Y \subset X$ with
$m(Y) > 1 - \varepsilon$ such that for $x \in Y$, $\mu_x(X \setminus E) < \varepsilon$ (by the Chebyshev
inequality). By removing at most an $m$-null set from $Y$ we may assume
that for every $x \in Y$ there exists $0 \le l < N$ such that $\mu_x = \mu_{T^l x_0}$ (this
follows from the previous paragraph). Fix $x \in Y$. By the point-wise ergodic
theorem, the set $S := \{k : T^{kN}x \notin E\}$ has density $d(S) = \mu_{T^l x_0}(X \setminus E) < \varepsilon$.
Therefore, when $K$ is sufficiently large, we know that $|S \cap [0, K)| < \varepsilon K$.

Define a new sequence by letting $u_k = v_k$ when $k \notin S$ (when $T^{kN}x \in E$)
and let $u_k = u$ otherwise. In the expression

\[ \theta_{1/N}C(x, KN) = v_0v_1\cdots v_{K-1}, \]

rewrite $v_k$ as $u - (u - v_k)$ for each $k \in S$. Multiplying out the resulting
expression yields $u_0u_1\cdots u_{K-1}$ together with many summands which we
will study presently.

Each of these summands is a product of matrices and each matrix is
either a $v_k$ or it is of the form $u - v_k$. Notice that $u - v_k$ is a strictly upper
triangular matrix. Let $L = 2\sup_{y,i,j}|\theta_{1/N}C(y, N)_{i,j}|$. Let $A \in U$ be the
matrix with $A_{i,j} = L$ when $i < j$ and let $B = A - 1$. Then

\[ |(v_k)_{i,j}| \le A_{i,j} \quad\text{and}\quad |(u - v_k)_{i,j}| \le B_{i,j}. \]

So we can overestimate the size of the entries of a summand by replacing
every $v_k$ by $A$ and every $u - v_k$ by $B$. Conveniently, $A$ and $B$ commute and

\[ B^d = 0. \]

So a summand containing $n$ factors of the form $u - v_k$ has entries bounded by
those of $A^{K-n}B^n$, and only the summands with $1 \le n \le d - 1$ contribute.
There is a constant $L'$ depending only on $L$ and $d$ such that for all $n$, every
entry of $B^n$ is bounded by $L'$. So,

\[ (A^{K-n}B^n)_{i,j} \le \sum_{k=i}^{j-n} (A^{K-n})_{i,k}L'. \]

Now we apply Lemma 4.4 (the proof if not the statement) to see that
$(A^{K-n})_{i,k}$ is bounded by a polynomial (in $K$) of degree at most $k - i$ with
coefficients depending only on $f$ and $d$. Adding these up, we see that the
expression above is bounded by a polynomial $p_{i,j,n}(K)$ of degree at most
$j - n - i$, again, with coefficients only depending on $f$ and $d$. Finally, we
have

\[ \big|(v_0\cdots v_{K-1})_{i,j} - (u_0\cdots u_{K-1})_{i,j}\big| \le \sum_{n=1}^{d-1} \binom{\varepsilon K}{n} p_{i,j,n}(K), \]

which is a polynomial of degree

\[ \max\Big\{\deg\binom{\varepsilon K}{n} + \deg p_{i,j,n}(K) : 1 \le n \le d - 1\Big\} \le n + (j - n - i) = j - i, \]

with $\varepsilon$ dividing the leading coefficient. Write $\varepsilon p(K)$ for this polynomial.

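Notice that the matrix $u$ and the sequence $u_k$ satisfy the hypotheses of
Lemma 4.5. Therefore

\[
\begin{aligned}
&\limsup_{K\to\infty} \frac{1}{K^{j-i}}\Big|(u_0u_1\cdots u_{K-1})_{i,j} - \frac{1}{N^{j-i}}(u^{KN})_{i,j}\Big| \\
&\quad\le \limsup_{K\to\infty} \frac{1}{K^{j-i}}\big|(u_0u_1\cdots u_{K-1})_{i,j} - (u^K)_{i,j}\big| + \limsup_{K\to\infty} \frac{1}{K^{j-i}}\Big|(u^K)_{i,j} - \frac{1}{N^{j-i}}(u^{KN})_{i,j}\Big| \\
&\quad= \limsup_{K\to\infty} \frac{1}{K^{j-i}}\big|(u_0u_1\cdots u_{K-1})_{i,j} - (u^K)_{i,j}\big| + 0 < \varepsilon,
\end{aligned}
\]

where the second summand in the middle of this calculation vanishes because,
by Lemma 4.4, it is a difference of two expressions having the same
limit.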

Let $M'$ be larger than the absolute value of any matrix entry of $C(y, l)$
or $u^l$ where $l$ ranges over $[0, N - 1]$ and $y$ ranges over $X$. By Lemmas 4.4
and 4.5 we can find a polynomial $q(n)$ of degree at most $j - i - 1$ which
gives an upper bound for $|C(x, n)_{i',j'}|$ for all $n$ and all $j' - i' < j - i$, and
also for $|(u^n)_{i',j'}|$. It now follows from simply writing out the multiplication
$C(x, KN + l) = C(T^{KN}x, l)C(x, KN)$ that

\[ |C(x, KN + l)_{i,j} - C(x, KN)_{i,j}| \le d\,q(KN)M'. \]

The same argument gives us $|(u^{KN+l})_{i,j} - (u^{KN})_{i,j}| \le d\,q(KN)M'$. Finally, for $K$
sufficiently large, we have

\[
\begin{aligned}
\big|C(x, KN + l)_{i,j} - (u^{KN+l})_{i,j}\big|
&\le |C(x, KN + l)_{i,j} - C(x, KN)_{i,j}| + \big|C(x, KN)_{i,j} - N^{j-i}(u_0u_1\cdots u_{K-1})_{i,j}\big| \\
&\quad + \big|N^{j-i}(u_0u_1\cdots u_{K-1})_{i,j} - (u^{KN})_{i,j}\big| + \big|(u^{KN})_{i,j} - (u^{KN+l})_{i,j}\big| \\
&\le d\,q(KN)M' + N^{j-i}\varepsilon p(K) + \varepsilon(KN)^{j-i} + d\,q(KN)M' \\
&= \varepsilon\big[N^{j-i}p(K) + (KN)^{j-i}\big] + 2d\,q(KN)M'.
\end{aligned}
\]

Recall that $\deg q < j - i$ and $\deg p \le j - i$. So, we have a polynomial in $K$ of
degree at most $j - i$ with $\varepsilon N^{j-i}$ dividing the coefficient on $K^{j-i}$. Therefore,
if we divide this whole expression by $(KN + l)^{j-i}$ and take a limit, we get

\[ \limsup_{KN+l\to\infty} \frac{1}{(KN + l)^{j-i}}\big|C(x, KN + l)_{i,j} - (u^{KN+l})_{i,j}\big| \le c\varepsilon, \]

where $c$ is some constant depending only on $f$ and $d$. Now we apply Lemma 4.4
and conclude that for all $x$ in a set $Y$ of measure at least $1 - \varepsilon$,

\[ \limsup_{n\to\infty} \Big| \frac{1}{n^{j-i}}C(x, n)_{i,j} - \lambda\prod_{k=i}^{j-1}\int f(x)_{k,k+1}\,dm \Big| \le c\varepsilon. \]

Since $\varepsilon > 0$ was arbitrary, in fact, the limit is zero almost everywhere. This
completes the proof of the ergodic case of the Theorem.

If $m$ is not ergodic then let $m = \int m_x\,dm(x)$ be its ergodic decomposition.
Let $X'$ be the set of all $x \in X$ for which $\theta_{1/n}C(x, n)$ converges. Regrettably,
we must show this set is measurable. Choose a countable dense set
$\{w_k : k \in \mathbb{N}\}$ in $U$. Then the measurable set

\[ X'(k, \varepsilon) := \bigcup_{N=1}^{\infty} \bigcap_{n=N}^{\infty} \{x \in X : \theta_{1/n}C(x, n) \in B_\varepsilon(w_k)\} \]

consists of all $x \in X$ for which $\theta_{1/n}C(x, n)$ eventually always lies in the
$\varepsilon$-ball around $w_k$. Now

\[ X' = \bigcap_{n=1}^{\infty} \bigcup_{k=0}^{\infty} X'(k, 1/n) \]

is measurable as claimed.

By the ergodic case, $m_x(X') = 1$, so

\[ m(X') = \int m_x(X')\,dm(x) = \int 1\,dm = 1. \]

Finally, for almost every $x$, we have computed the exact value of $f^*(x)$
and it depends only on $m_x$. Since $m_{Tx} = m_x$ for $m$-almost every $x$, we have
$f^* \circ T = f^*$ almost everywhere. □



The proof of Theorem 1.7 is easier and similar to that of Theorem 1.6.
So, we only sketch it here.

Proof of Theorem 1.7. Fix $\varepsilon > 0$. Let $M = \sup_{i,y}|\theta_{1/N}C(y, N)_{i,i+1}|$ and
choose $\delta$ as in Lemma 4.5. Choose $N$ sufficiently large that

\[ \Big| \frac{1}{N}\sum_{n=0}^{N-1} C(T^n x, 1)_{i,i+1} - \int_X C(y, 1)_{i,i+1}\,dm(y) \Big| < \delta \]

for all $i$ and for all $x \in X$. Let $x$ be any point at all, and let
$u_k = \theta_{1/N}C(T^{kN}x, N)$. Let $u$ be as in the proof of Theorem 1.6. Now we can
skip most of the difficulties and apply Lemma 4.5 to $u$ and $u_k$. Use the same
argument as in Theorem 1.6 to finish the proof. □

A short comment about coordinates: In $\mathbb{R}$ or in $\mathbb{R}^d$, there is only one
reasonable way to dilate: one must multiply by a scalar. But in $U$ this is
not the case. The one parameter family $\theta_t$ of dilations we used is in no way
canonical. For instance, we could define $\theta'_t(u) = v\theta_t(v^{-1}uv)v^{-1}$, for some
$v \in U$. In other words, we could conjugate $\theta_t$ by an inner automorphism
of $U$. This results in nothing but a change of coordinates. So, obviously,
Theorem 1.6 works just as well with $\theta'$ in the different coordinate system.
Assuming we have an ergodic system we get

\[ \lim_{n\to\infty} \theta'_{1/n}\big(f(T^{n-1}x)\cdots f(T^2x)f(Tx)f(x)\big)_{i,j} = \lambda\prod_{k=i}^{j-1}\int (v^{-1}fv)_{k,k+1}\,dm. \]

Proof of Theorem 4.1. The strategy of this proof is similar to the geometric
proof of Theorem 3.1. However, in place of the full $G$ action on $X$, which is
available to us in that Theorem, we employ Theorem 1.6.

Suppose $\mu$ projects to Haar measure on $\mathbb{T}^{d-1}$. Fix $\varepsilon > 0$ and define a
partition $\{C_i\}_i$ of $\mathbb{T}^d \cong [0,1)^d$ by $(x_1, \dots, x_d) \in C_i$ if $i\varepsilon \le x_d < (i + 1)\varepsilon$.
Let $Q_i = \{(x_1, \dots, x_{d-1}, i\varepsilon) : x_i \in \mathbb{T}\} \subset C_i$ and let $\Phi_i : C_i \to Q_i$ be the
projection given by $\Phi_i(x_1, \dots, x_{d-1}, x_d) = (x_1, \dots, x_{d-1}, i\varepsilon)$. Write $\mu_i$ for the
pushforward of $\mu|_{C_i}$ under $\Phi_i$ and let $\mu' = \sum_i \mu_i$.

Notice that $\mu$ and $\mu'$ have the same projection onto the first $(d - 1)$
coordinates. Assemble the maps $\Phi_i$ into one map $\Phi$ by setting $\Phi|_{C_i} = \Phi_i$.
Then $\mu' = \Phi_*\mu$. Since $T^n\Phi$ and $T^n$ are point-wise $\varepsilon$-close,
$\|T^n_*\mu - T^n_*\mu'\|_* < \varepsilon$.

Write $\lambda$ for Lebesgue measure on $\mathbb{T}^{d-1}$ and $\lambda_i$ for Lebesgue measure on
the $(d - 1)$-dimensional tori $Q_i$. Fix $j$ and suppose $N_j \subset Q_j$ is a $\lambda_j$-null set.
Let $N = \pi(N_j)$ and let $N_i = (\pi|_{Q_i})^{-1}(N)$ where $\pi$ is projection onto the
first $(d - 1)$ coordinates. Since $\pi_*(\mu')$ is absolutely continuous with respect
to Lebesgue measure, and since $\lambda(N) = 0$, we must have

\[ 0 = \pi_*(\mu')(N) = \sum_i (\pi_*\mu_i)(N) = \sum_i \mu_i(\pi^{-1}N) = \sum_i \mu_i(N_i). \]

It follows that $\mu_j(N_j) = 0$. We have proven that each $\mu_i$ is absolutely
continuous with respect to $\lambda_i$. This allows us to write $d\mu_i = \varphi'_i\,d\lambda_i$ where $\varphi'_i$ is
a measurable function on $Q_i$. Choose a continuous function $\varphi_i$ on $Q_i$ which
agrees with $\varphi'_i$ on the set $S_i \subset Q_i$ and write $d\nu = \sum_i \varphi_i\,d\lambda_i$. Make this choice
so that $0 \le \varphi_i \le 1$ and $\sum_i \mu_i(S_i) > 1 - \varepsilon$. It follows that for all $n$

\[ \|T^n_*\mu - T^n_*\nu\|_* \le \|T^n_*\mu - T^n_*\mu'\|_* + \|T^n_*\mu' - T^n_*\nu\|_* < 2\varepsilon. \]

Choose $\varepsilon' > 0$ such that $d(x, y) < \varepsilon'$ implies $|\varphi_i(x) - \varphi_i(y)| < \varepsilon^2$ for all $i$.

Since the skewing maps $f_k$ are Lipschitz, so is $T$. By Rademacher's theorem
the derivative $D_xT$ (the Jacobian of $T$ at $x$) exists almost everywhere.
The structure of the map $T$ tells us that, when it exists, $D_xT$ lies in $U$ (we
need to use the ordered basis $(\partial/\partial x_d, \dots, \partial/\partial x_1)$). Therefore $x \mapsto D_xT$ is
a bounded measurable function and we can apply Theorem 1.6 to conclude
$\theta_{1/n}(D_xT^n) = \theta_{1/n}(D_{T^{n-1}x}T \circ \cdots \circ D_xT)$ converges almost everywhere to
some matrix $u \in U$ with

\[ u_{i,j} = \lambda\prod_{k=i}^{j-1}\int (D_xT)_{k,k+1}\,dm \quad\text{for } j > i \ge 1 \text{ and } \lambda = \lambda(j - i) > 0. \]

The assumption on the loops $\gamma_k(x_k) := f_k(x_0, x_1, \ldots, x_k)$ implies that for each $k$ the integral of $(D_xT)_{k,k+1}$ with respect to Lebesgue measure on $\mathbb{T}^d$ is a nonzero integer. Indeed, each $\gamma_k$ is a map $\mathbb{T} \to \mathbb{T}$ and so represents a class $[\gamma_k] \in \pi_1(\mathbb{T})$, which is isomorphic to $\mathbb{Z}$ with isomorphism given by $[\gamma] \mapsto \int \gamma'\,dm_{\mathbb{T}}$ (if we think of $\gamma'$ as taking values in $\mathbb{R}$). Since the homotopy class of $\gamma_k$ is independent of the choice of $x_0, \ldots, x_{k-1}$, we see that $\int (D_xT)_{k,k+1}\,dm$ is again this nonzero integer.
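The identification of the homotopy class of a loop with the integral of its derivative can be illustrated numerically. The following is only a sketch: the loop `gamma_prime` below is a hypothetical example (not one of the skewing maps of the paper), and the integral is approximated by a midpoint rule.

```python
import math

def degree(gamma_prime, samples=10_000):
    """Midpoint-rule approximation of the integral of gamma' over [0, 1]."""
    h = 1.0 / samples
    return h * sum(gamma_prime((i + 0.5) * h) for i in range(samples))

# Hypothetical loop with real-valued lift gamma(x) = 3x + 0.2 sin(2 pi x).
# Its derivative integrates to the number of times the loop wraps around T.
def gamma_prime(x):
    return 3.0 + 0.4 * math.pi * math.cos(2.0 * math.pi * x)

winding = degree(gamma_prime)
print(round(winding))
```

The oscillating part of the derivative integrates to zero, so only the integer winding number survives, which is the reason the integral above depends only on the homotopy class.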
We have shown that for every $j > i$, $D_x(T^n)_{i,j}/n^{j-i}$ tends uniformly to some non-zero number. Assume $\varepsilon$ was chosen sufficiently small that each of these non-zero numbers is greater than $\varepsilon$ in absolute value. Fix $t > 0$. We will show that when $n$ is large $T^n_*\nu$ is nearly invariant under rotation by $t$ in the last coordinate. Write $R(x_1, \ldots, x_d) = (x_1, \ldots, x_d + t)$ and $S(x_1, \ldots, x_d) = (x_1 + \delta, x_2, \ldots, x_d)$ where $\delta$ will be determined later. Assume $\delta < \varepsilon'$. Then

$$\|T^n_*S_*\nu - T^n_*\nu\|_* \le \Big\|T^n_*\Big(\sum_i |\varphi_i \circ S - \varphi_i|\,d\lambda_i\Big)\Big\|_* \le \sum_i \varepsilon^2 < 2\varepsilon$$

(since there are approximately $1/\varepsilon$ indices $i$).

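Fix $(x_1, \ldots, x_d)$ and let $\gamma(s) = (x_1(1-s) + (x_1+\delta)s, x_2, \ldots, x_d)$ be the linear curve connecting $(x_1, \ldots, x_d)$ to $S(x_1, \ldots, x_d)$. Then

$$T^nS(x_1, \ldots, x_d) = T^n(x_1, \ldots, x_d) + \int_0^1 D_{\gamma(s)}T^n \begin{pmatrix} 0 \\ \vdots \\ 0 \\ \delta \end{pmatrix} ds.$$

Let $\delta = \delta(n) = t/(u_{1,d}\,n^{d-1})$. Then

$$\lim_{n\to\infty} D_{\gamma(s)}T^n \begin{pmatrix} 0 \\ \vdots \\ 0 \\ \delta \end{pmatrix} = \lim_{n\to\infty} t \begin{pmatrix} (u_{1,d}\,n^{d-1})^{-1}(D_{\gamma(s)}T^n)_{1,d} \\ \vdots \\ (u_{1,d}\,n^{d-1})^{-1}(D_{\gamma(s)}T^n)_{d,d} \end{pmatrix} = \begin{pmatrix} t \\ 0 \\ \vdots \\ 0 \end{pmatrix}.$$
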

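So, by taking $n$ sufficiently large and $\delta$ as above, we can ensure that the maps $T^nS$ and $RT^n$ are point-wise $\varepsilon$-close. This implies that $\|R_*T^n_*\nu - T^n_*S_*\nu\|_* < \varepsilon$. Combining results yields

$$\|R_*T^n_*\nu - T^n_*\nu\|_* \le \|R_*T^n_*\nu - T^n_*S_*\nu\|_* + \|T^n_*S_*\nu - T^n_*\nu\|_* < 3\varepsilon.$$

Combining yet more results:

$$\|R_*T^n_*\mu - T^n_*\mu\|_* \le \|R_*T^n_*\mu - R_*T^n_*\nu\|_* + \|R_*T^n_*\nu - T^n_*\nu\|_* + \|T^n_*\nu - T^n_*\mu\|_* < 2\varepsilon + 3\varepsilon + 2\varepsilon.$$

This completes the proof. □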
Question 4.6. Are there skew products on $\mathbb{T}^2$ of the form $T(x,y) = (x + \alpha, y + f(x))$ which are uniquely ergodic but not weakly twisting, or weakly twisting but not twisting? Necessarily, $f$ must be non-Lipschitz.

Question 4.7. In Theorem 1.6 we take $f : X \to U$ to be bounded and measurable. Instead, let us assume that its coordinate functions $f_{ij}$ lie in $L^p(X)$. For what values of $p$ does the theorem still hold?

5 Skew products with expansive fibers 

In the proofs of Theorems 3.1 and 4.1 we repeatedly exploit the fact that our system is an isometric extension of a simpler system. This allows us to perturb our measures in the isometric direction to yield a new measure which does not deviate from the original measure under application of $T$. Not all skew products are of this convenient form. Even for skew products $(x, y) \mapsto (x + \alpha, 2y + f(x))$ (on the two-dimensional torus) we see exponential growth in the $y$ direction. In this case it is not reasonable to expect that such strong equidistribution results hold.

We will now show that even Lebesgue measure $\mu$ supported on a horizontal line $x \mapsto (x, y_0)$ may fail to equidistribute. Even worse, for some choices of $f$, we will see that repeated application of $T$ yields measures that are never close to Haar measure. The next theorem gives a very precise picture of what weak* limits $\theta$ of $T^n_*\mu$ look like in this case. It decomposes $\mathbb{T}^2$ into two sets. On one set, $\theta$ restricts to Lebesgue measure and on the other $\theta$ is supported on a Lipschitz curve.

Theorem 5.1. Let $X = \mathbb{T}^2$ and let $T(x, y) = (x + \alpha, py + f(x))$ where $\alpha$ is irrational, $|p| \ge 2$ is an integer, and $f : \mathbb{T} \to \mathbb{T}$ is continuously differentiable. Let $\mu$ be a probability measure on $X$, supported on the graph of a differentiable curve $x \mapsto (x, \gamma(x))$, whose projection $\pi_*\mu$ onto the first coordinate is absolutely continuous with respect to Haar measure, and let $\theta$ be any weak* limit of $T^n_*\mu$. Let

$$S = \Big\{x \in \mathbb{T} : p\gamma'(x) + \sum_{n=0}^{\infty} \frac{1}{p^n} f'(x + n\alpha) \ne 0\Big\}.$$

Then there exists some $\beta \in \mathbb{T}$ such that $\theta|_{\pi^{-1}(\beta+S)}$ is invariant under vertical rotation. In particular, if $\mu$ projects to Lebesgue measure on the circle, then $\theta|_{\pi^{-1}(\beta+S)}$ is Lebesgue. Furthermore, on each connected component of $\pi^{-1}(\mathbb{T} \setminus (\beta + S))$, $\theta$ is supported on the graph of a Lipschitz curve with Lipschitz constant $\frac{1}{p-1}\sup|f'|$.
Notice that this theorem does not require that $f$ be homotopically non-trivial.

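Proof. Write $\tau(x) := p\gamma'(x) + \sum_{n=0}^{\infty} p^{-n} f'(x + n\alpha)$. First observe that

$$T^n(x, \gamma(x)) = \Big(x + n\alpha,\ p^n\gamma(x) + \sum_{k=0}^{n-1} p^{n-1-k} f(x + k\alpha)\Big).$$

If we take the derivative of the $y$ coordinate we get

$$A_n(x) := p^n\gamma'(x) + \sum_{k=0}^{n-1} p^{n-1-k} f'(x + k\alpha).$$

Therefore $|A_n(x) - p^{n-1}\tau(x)| \le \frac{1}{p-1}\sup|f'|$. This proves that $|A_n(x)| \to \infty$ when $x \in S$ and is bounded by $\frac{1}{p-1}\sup|f'|$ otherwise. Write $\kappa = \frac{1}{p-1}\sup|f'|$. In particular, $T^n(x, \gamma(x))$ is $\kappa$-Lipschitz on $\mathbb{T} \setminus S$.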

Let $\theta$ be a subsequential limit of $T^n_*\mu$ and pass to a further subsequence along which $n\alpha$ converges to some point $\beta \in \mathbb{T}$ and $T^n(x, \gamma(x))$ converges point-wise to some function $(x + \beta, \eta(x))$. From now on $n$ will always represent an element of this subsequence. Point-wise convergence of functions on a compact set, all satisfying the same Lipschitz condition, is necessarily uniform convergence. This proves that $\theta|_{\pi^{-1}(\mathbb{T} \setminus (\beta+S))}$ is supported on the curve $(x + \beta, \eta(x))|_{\mathbb{T} \setminus S}$. Furthermore, the limiting curve $\eta$ is necessarily $\kappa$-Lipschitz on the connected components of $\mathbb{T} \setminus S$.

It remains to be shown that $\theta|_{\pi^{-1}(\beta+S)}$ is invariant under vertical rotation. To make things simpler, instead of $T^n_*\mu$, we will study the push-forward $\nu_n$ of Lebesgue measure on $S$ by the map $x \mapsto (x, \gamma_n(x))$, where

$$\gamma_n(x) := p^n\gamma(x - n\alpha) + \sum_{k=0}^{n-1} p^{n-1-k} f(x + k\alpha - n\alpha).$$

The only difference between $T^n_*\mu$ and $\nu_n$ is a horizontal translation.

We assumed that $\mu$ projects to some measure absolutely continuous with respect to Lebesgue measure on $\mathbb{T}$. It suffices to consider the case where $\mu$ projects to Lebesgue measure, since, if $\rho$ is a continuous, vertically invariant function on $\mathbb{T}^2$, $\lim \rho\,d\nu_n = \rho \lim d\nu_n$, and by Lusin's Theorem, the density function for $\mu$ is continuous on a set of as large measure as we wish.

Fix $\varepsilon > 0$. Let $S' = \{x : |\tau(x)| > \varepsilon\}$ and assume $\varepsilon$ is small enough that the Lebesgue measure of $S'$ is within $\varepsilon$ of the Lebesgue measure of $S$. Choose $\delta > 0$, $\delta < \varepsilon$, such that for any $x_1, x_2 \in S'$ with $d(x_1, x_2) < \delta$ we have $|\tau(x_1) - \tau(x_2)| < \varepsilon^2$. Choose $N$ such that $p^N\varepsilon\delta > 2$, and also such that

$$\frac{p^{N-1}(\varepsilon + \varepsilon^2) + \kappa}{p^{N-1}(\varepsilon - \varepsilon^2) - \kappa}$$

and its reciprocal are both within $3\varepsilon$ of 1 (this is possible when $\varepsilon < 1/3$).

Let $n > N$. We will construct a partition of $S'$ into intervals $[a_i, b_i)$ together with an exceptional set $E$. Since $S'$ is open, it is a union of disjoint open intervals. Ignore all intervals having length less than $\delta$. Each remaining interval we write as a union of as many contiguous intervals $[a, b)$ as possible with the property that $d(a, b) < \delta$ and such that, on each $[a, b)$, $\gamma_n$ is monotonic, and $\gamma_n(a) = \gamma_n(b) = 1$ (this is possible because of the first assumption on the size of $N$). Furthermore, assume $\gamma_n(c) \ne 1$ for all $a < c < b$. Write $\{[a_i, b_i) : i \in I\}$ for this collection of intervals. Let $E = S' \setminus \bigcup_i [a_i, b_i)$. If $\delta$ is sufficiently small then $\mu(E) < \varepsilon$. Put more succinctly, we break $S'$ into intervals on which $\gamma_n$ wraps once, monotonically, around $\mathbb{T}$, together with an exceptional set $E$ consisting of 'scraps'.

Consider any rectangle of the form $R := [a_i, b_i) \times [y_1, y_2]$. This rectangle is intersected by the curve $(x, \gamma_n(x))$ in exactly one arc. To avoid unnecessary cases, let's assume that $\gamma_n$ is increasing on $[a_i, b_i)$ (if $\gamma_n$ is decreasing the argument is analogous). Notice that $\gamma_n|_{[a_i, b_i)}$ traverses $[y_1, y_2]$ exactly once. Let $a_i \le c_1 < c_2 \le b_i$ be such that $\gamma_n(c_1) = y_1$ and $\gamma_n(c_2) = y_2$. Now we estimate: for $x \in [a_i, b_i)$,

$$p^{n-1}\tau(x) - \kappa \le \gamma_n'(x) \le p^{n-1}\tau(x) + \kappa$$
$$p^{n-1}(\tau(a_i) - \varepsilon^2) - \kappa \le \gamma_n'(x) \le p^{n-1}(\tau(a_i) + \varepsilon^2) + \kappa.$$

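Integrating over the intervals $[a_i, b_i)$ and $[c_1, c_2)$ yields

$$(p^{n-1}(\tau(a_i) - \varepsilon^2) - \kappa)(b_i - a_i) \le 1 \le (p^{n-1}(\tau(a_i) + \varepsilon^2) + \kappa)(b_i - a_i)$$
$$(p^{n-1}(\tau(a_i) - \varepsilon^2) - \kappa)(c_2 - c_1) \le y_2 - y_1 \le (p^{n-1}(\tau(a_i) + \varepsilon^2) + \kappa)(c_2 - c_1).$$

It follows that

$$\frac{(p^{n-1}(\tau(a_i) - \varepsilon^2) - \kappa)(c_2 - c_1)}{(p^{n-1}(\tau(a_i) + \varepsilon^2) + \kappa)(b_i - a_i)} \le y_2 - y_1 \le \frac{(p^{n-1}(\tau(a_i) + \varepsilon^2) + \kappa)(c_2 - c_1)}{(p^{n-1}(\tau(a_i) - \varepsilon^2) - \kappa)(b_i - a_i)}$$

$$\frac{(p^{n-1}(\varepsilon - \varepsilon^2) - \kappa)(c_2 - c_1)}{(p^{n-1}(\varepsilon + \varepsilon^2) + \kappa)(b_i - a_i)} \le y_2 - y_1 \le \frac{(p^{n-1}(\varepsilon + \varepsilon^2) + \kappa)(c_2 - c_1)}{(p^{n-1}(\varepsilon - \varepsilon^2) - \kappa)(b_i - a_i)}$$

$$(1 - 3\varepsilon)\frac{c_2 - c_1}{b_i - a_i} \le y_2 - y_1 \le (1 + 3\varepsilon)\frac{c_2 - c_1}{b_i - a_i}$$

$$(1 - 3\varepsilon)\nu_n(R) \le m(R) \le (1 + 3\varepsilon)\nu_n(R),$$

since $\nu_n(R) = c_2 - c_1$ and $m(R) = (y_2 - y_1)(b_i - a_i)$. If we let $f$ be any 1-Lipschitz function on $\mathbb{T}^2$ with $|f| \le 1$, we see that

$$\Big|\int_{\pi^{-1}S'} f\,d\nu_n - \int_{\pi^{-1}S'} f\,dm\Big| \le 3\varepsilon + 2\delta,$$

so $\|\nu_n|_{\pi^{-1}S'} - m|_{\pi^{-1}S'}\|_* < 5\varepsilon$.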



Finally, since $\nu_n$ and $m$ both give $\pi^{-1}(S \setminus S')$ measure at most $\varepsilon$, we have $\|\nu_n|_{\pi^{-1}S} - m|_{\pi^{-1}S}\|_* < 7\varepsilon$. Since this holds for all $n$ larger than $N$, the theorem is proven. □
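The closed form for $T^n(x, \gamma(x))$ and the bound $|A_n(x) - p^{n-1}\tau(x)| \le \kappa$ used at the start of the proof can be checked numerically. The sketch below makes several assumptions that are not in the text: the particular choices of `ALPHA`, `P`, `f` and `gamma` are hypothetical, and the infinite series defining $\tau$ is truncated after 200 terms.

```python
import math

ALPHA = math.sqrt(2.0) - 1.0   # illustrative irrational rotation number
P = 2                          # expansion factor in the fiber

def f(x):                      # hypothetical 1-periodic skewing function
    return math.sin(2.0 * math.pi * x) / (2.0 * math.pi)

def f_prime(x):                # sup |f'| = 1
    return math.cos(2.0 * math.pi * x)

def gamma(x):                  # hypothetical curve carrying the measure
    return 0.3 * math.sin(2.0 * math.pi * x)

def gamma_prime(x):
    return 0.6 * math.pi * math.cos(2.0 * math.pi * x)

def circle_dist(a, b):
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)

def iterate(x, y, n):
    """Apply T(x, y) = (x + ALPHA, P*y + f(x)) n times on the torus."""
    for _ in range(n):
        x, y = (x + ALPHA) % 1.0, (P * y + f(x)) % 1.0
    return x, y

def closed_form(x, n):
    """Closed form for T^n(x, gamma(x)) from the proof."""
    s = sum(P ** (n - 1 - k) * f(x + k * ALPHA) for k in range(n))
    return (x + n * ALPHA) % 1.0, (P ** n * gamma(x) + s) % 1.0

def A(x, n):
    """Derivative of the y coordinate of T^n(x, gamma(x))."""
    return P ** n * gamma_prime(x) + sum(
        P ** (n - 1 - k) * f_prime(x + k * ALPHA) for k in range(n))

def tau(x, terms=200):
    """Truncation of the series defining tau."""
    return P * gamma_prime(x) + sum(
        P ** (-j) * f_prime(x + j * ALPHA) for j in range(terms))

n = 12
kappa = 1.0 / (P - 1)          # sup|f'| / (p - 1) with sup|f'| = 1
xs = [i / 50.0 for i in range(50)]
max_position_error = max(
    max(circle_dist(a, b) for a, b in zip(iterate(x, gamma(x), n), closed_form(x, n)))
    for x in xs)
max_gap = max(abs(A(x, n) - P ** (n - 1) * tau(x)) for x in xs)
print(max_position_error, max_gap, kappa)
```

Only moderate values of `n` are meaningful here, since floating-point error in the fiber coordinate grows like $p^n$.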

The next (trivial) Corollary gives an extremely non-optimal estimate on the size of $S$. Its value is in providing an easy method for over-estimating the (failure of) equidistribution of the sequence $T^n_*\mu$ for any differentiable skewing factor $f$ whatsoever.

Corollary 5.2. The set $S$ on which the conclusions of Theorem 5.1 hold satisfies

$$m(S) \ge 1 - m\Big(\Big\{x : |\gamma'(x) + p^{-1}f'(x)| \le \sup|f'|\,\frac{p^{-1}}{p-1}\Big\}\Big) =: 1 - \beta.$$

In particular, if $\varphi$ is any continuous test function on $\mathbb{T}^2$ and $\mu$ projects to Haar measure on the first coordinate then

$$\limsup_{n\to\infty} \Big|\int \varphi\,dm - \int \varphi \circ T^n\,d\mu\Big| \le \beta \sup|\varphi|.$$

Proof. If $x \notin S$ then $\tau(x) = 0$ (where $\tau$ is defined as in the proof of Theorem 5.1). A simple application of the triangle inequality yields

$$|\gamma'(x) + p^{-1}f'(x)| \le \sum_{n=1}^{\infty} |p^{-n-1} f'(x + n\alpha)| \le \sup|f'|\,\frac{p^{-1}}{p-1},$$

which proves the first claim. The second claim is a trivial consequence of the first. □
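As a rough illustration of how the bound in Corollary 5.2 can be evaluated, the following sketch estimates $\beta$ on a grid. The example data are assumptions chosen for illustration only: a horizontal line (so $\gamma' = 0$), $f'(x) = \cos(2\pi x)$ and $p = 3$, for which the set in question is $\{x : |\cos(2\pi x)| \le 1/2\}$ and has measure $1/3$.

```python
import math

def beta_estimate(gamma_prime, f_prime, p, sup_f_prime, samples=100_000):
    """Grid estimate of m({x : |gamma'(x) + f'(x)/p| <= sup|f'| / (p (p - 1))})."""
    threshold = sup_f_prime / (p * (p - 1))
    hits = 0
    for i in range(samples):
        x = (i + 0.5) / samples
        if abs(gamma_prime(x) + f_prime(x) / p) <= threshold:
            hits += 1
    return hits / samples

# Hypothetical data: horizontal line, f'(x) = cos(2 pi x), p = 3.
beta = beta_estimate(
    gamma_prime=lambda x: 0.0,
    f_prime=lambda x: math.cos(2.0 * math.pi * x),
    p=3,
    sup_f_prime=1.0,
)
print(beta)
```

With these choices the corollary gives $m(S) \ge 2/3$. For $p = 2$ and the same $f$ the threshold set is all of $\mathbb{T}$, so the estimate carries no information, in keeping with the remark that it is far from optimal.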

Corollary 5.3. Let $(X, T)$, $\gamma$ be as in Theorem 5.1 and suppose $\mu$ projects to Lebesgue measure on the first coordinate. If $f$ and $\gamma$ are analytic, then either $T^n_*\mu$ tends to Lebesgue measure or $f(x) = \gamma(Tx) - p\gamma(x)$ plus some constant.

Proof. If $f$ and $\gamma$ are analytic, then so are $f'$, $\gamma'$, and $\tau$. If the set of zeros of $\tau$ is a null set, then by Theorem 5.1 $T^n_*\mu$ tends toward Lebesgue measure. Otherwise, analyticity of $\tau$ implies $\tau = 0$. The relation

$$\tau(x) = p\gamma'(x) + f'(x) + p^{-1}(\tau(Tx) - p\gamma'(Tx))$$

reduces to $f'(x) = \gamma'(Tx) - p\gamma'(x)$. □
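The degenerate alternative in Corollary 5.3 can be seen numerically: if $f'(x) = \gamma'(x + \alpha) - p\gamma'(x)$, the series for $\tau$ telescopes and its partial sums are of size $p^{-M}$. The sketch below assumes a hypothetical $\gamma(x) = \sin(2\pi x)/(2\pi)$, $p = 2$, and a truncation of the series at 60 terms.

```python
import math

ALPHA = math.sqrt(2.0) - 1.0   # illustrative irrational rotation number
P = 2

def gamma_prime(x):
    # derivative of the hypothetical curve gamma(x) = sin(2 pi x) / (2 pi)
    return math.cos(2.0 * math.pi * x)

def f_prime(x):
    # derivative of f(x) = gamma(x + ALPHA) - P * gamma(x) + constant
    return gamma_prime(x + ALPHA) - P * gamma_prime(x)

def tau(x, terms=60):
    """Truncation of tau(x) = p gamma'(x) + sum_n p^(-n) f'(x + n alpha)."""
    return P * gamma_prime(x) + sum(
        P ** (-j) * f_prime(x + j * ALPHA) for j in range(terms))

worst = max(abs(tau(i / 100.0)) for i in range(100))
print(worst)
```

Up to rounding, `worst` is zero, so for such $f$ the set $S$ of Theorem 5.1 is empty.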



Example 5.4. Let $\mu$ be Lebesgue measure on the horizontal line $(x, y_0)$ and let $T(x, y) = (x + \alpha, 2y + f(x))$. If $f$ is strictly monotonic then $|\tau(x)| > 0$. So, by Theorem 5.1, $T^n_*\mu$ equidistributes.
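A quick numerical experiment is consistent with Example 5.4. The sketch below takes the hypothetical choice $f(x) = x$ and replaces Lebesgue measure on the line by a uniform grid of 5000 points, both of which are assumptions made for illustration. It computes the Fourier coefficient of the pushed-forward measure at the character $(a, b)$; for Haar measure every nontrivial coefficient vanishes.

```python
import cmath
import math

ALPHA = math.sqrt(2.0) - 1.0   # illustrative irrational rotation number

def T(x, y):
    # Skew product with p = 2 and the hypothetical monotone choice f(x) = x.
    return (x + ALPHA) % 1.0, (2.0 * y + x) % 1.0

def fourier_coefficient(a, b, n, y0=0.25, samples=5000):
    """Average of exp(2 pi i (a x + b y)) over the image under T^n of a grid on the line y = y0."""
    total = 0j
    for i in range(samples):
        x, y = i / samples, y0
        for _ in range(n):
            x, y = T(x, y)
        total += cmath.exp(2j * math.pi * (a * x + b * y))
    return total / samples

before = abs(fourier_coefficient(0, 1, 0))   # the line itself: far from Haar
after = abs(fourier_coefficient(1, 1, 10))   # after ten iterations
print(before, after)
```

The grid size has to be large compared with $2^n$ for the discrete average to represent the integral, so this is a sanity check for small $n$ rather than evidence about the limit.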

Example 5.5. Let $f$ be arbitrary and let $\gamma'$ agree with

$$-\frac{1}{p}\sum_{n=0}^{\infty} p^{-n} f'(x + n\alpha)$$

on $[0, 1/2]$ and disagree elsewhere. Then $S = (1/2, 1)$. Let $\mu$ be the measure on the graph of $\gamma$ which projects to Lebesgue measure on the first coordinate. Let $\theta$ be any weak* limit of $T^n_*\mu$ and apply Theorem 5.1 to conclude that $\theta$ is equal to Lebesgue measure on a translate $\pi^{-1}(\beta + S)$ and is supported on a $\sup|f'|$-Lipschitz function elsewhere. The same is true of $T^n_*\theta$ for all $n$.

Because of the shape of the support of $\theta$ on $\pi^{-1}(\beta + S)$, there is some constant $\varepsilon$ (depending only on the Lipschitz constant $\sup|f'|$) such that one can always find a ball of radius $\varepsilon$ disjoint from the support. It is easy to see that there is some smaller constant $\varepsilon' > 0$ such that $\|T^n_*\theta - m\|_* > \varepsilon'$
for all n. So we can choose a weak* limit of i Yln=o = ~k ^n=o &T?e 

to get an invariant measure r/ G P(JP\^ which must be different from 5 m . 

Now take $f$ to be the function $f(x) = x$. In this case, further analysis reveals that on the complement of $\pi^{-1}(\beta + S)$, $\theta$ is supported on a line with slope $-\frac{1}{p-1}$. $T^n_*\theta$ also has this property. So $T^n_*\theta = (n\alpha, y_n) + \theta$ for some $y_n \in \mathbb{T}$ (here, $(n\alpha, y_n) + \theta$ represents the pushforward of $\theta$ under addition of $(n\alpha, y_n)$). It follows that the orbit closure $Y := \overline{\{T^n_*\theta : n\}}$ is homeomorphic to some closed subset of $\mathbb{T}^2$. In fact, suppose $(\beta, y_0)$ is one endpoint of the line which is the support of $\theta$ on the complement of $\pi^{-1}(\beta + S)$. Then $(Y, T_*)$ is isomorphic as a topological system to the orbit closure of this point $\overline{\{T^n(\beta, y_0) : n \in \mathbb{Z}\}}$ with isomorphism given by $(x, y) \mapsto (x - \beta, y - y_0) + \theta$.

The system (Y, T*) has, as a factor, the rotation by a on T. The factor 
map is given by Y 3 (x, y) + 9 1— > x. So, the measure r\ £ P(Y) C P(P\) 
we constructed above must project to the Lebesgue measure on the rotation 
factor. This provides an example of an invariant measure on P\ which is not 
a convex combination of delta-masses at fixed points. 


